scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-24 00:32:15 +00:00

Author	SHA1	Message	Date
Avi Kivity	611918056a	Merge 'repair: Add tablet incremental repair support' from Asias He
The central idea of incremental repair is to allow repair participants to select and repair only a portion of the dataset to speed up the repair process. All repair participants must use an identical selection method so that they repair and synchronize the same selected dataset. There are two primary selection methods: time-based and file-based. The time-based method selects data within a specified time frame. It is versatile but less efficient, because it requires reading the whole dataset and discarding data outside the time frame. The file-based method selects data from unrepaired SSTables and is more efficient because it allows entire SSTables to be skipped. This patch implements the file-based selection method.
Incremental repair will only be supported for tablet tables; it will not be supported for vnode tables. On one hand, legacy vnodes are less important to support. On the other hand, incremental repair for vnodes is much harder to implement. With vnodes, an SSTable could contain data for multiple vnode ranges. When a given vnode range is repaired, only a portion of the SSTable is repaired. This complicates the manipulation of SSTables significantly during both repair and compaction. With tablets, an entire tablet is repaired, so an sstable is either fully repaired or not repaired at all, which is a huge simplification.
This patch uses the repaired_at from the sstables::statistics component to mark an sstable as repaired. It uses a virtual clock as the repair timestamp, i.e., a monotonically increasing number for the repaired_at field of an SSTable and the sstables_repaired_at column in the system.tablets table. Notice that when an sstable is not repaired, the repaired_at field is set to the default value 0. The being_repaired in-memory field of an SSTable is used to explicitly mark that an SSTable is being selected.
The following variables are used for incremental repair:
The repaired_at on-disk field of an SSTable is used.
  - A 64-bit number that increases sequentially
The sstables_repaired_at is added to the system.tablets table.
  - repaired_at <= sstables_repaired_at means the sstable is repaired
The being_repaired in-memory field of an SSTable is added.
  - A repair UUID tells which sstable has participated in the repair
Initial test results:
1) Medium dataset results
  Node amount: 3
  Instance type: i4i.2xlarge
  Disk usage per node: ~500GB
  Cluster pre-populated with ~500GB of data before starting the repair job.
  Results for Repair Timings:
  The regular repair run took 210 mins.
  Incremental repair 1st run took 183 mins, 2nd and 3rd runs took around 48s
  The speedup is: 183 mins / 48s = 228X
2) Small dataset results
  Node amount: 3
  Instance type: i4i.2xlarge
  Disk usage per node: ~167GB
  Cluster pre-populated with ~167GB of data before starting the repair job.
  Regular repair 1st run took 110s, 2nd and 3rd runs took 110s.
  Incremental repair 1st run took 110 seconds, 2nd and 3rd run took 1.5 seconds.
  The speedup is: 110s / 1.5s = 73X
3) Large dataset results
  Node amount: 6
  Instance type: i4i.2xlarge, 3 racks
  50% of base load, 50% read/write
  Dataset == Sum of data on each node
  Dataset    Non-incremental repair (minutes)
  1.3 TiB    31:07
  3.5 TiB    25:10
  5.0 TiB    19:03
  6.3 TiB    31:42
  Dataset    Incremental repair (minutes)
  1.3 TiB    24:32
  3.0 TiB    13:06
  4.0 TiB    5:23
  4.8 TiB    7:14
  5.6 TiB    3:58
  6.3 TiB    7:33
  7.0 TiB    6:55
Fixes #22472
Closes scylladb/scylladb#24291
* github.com:scylladb/scylladb:
  replica: Introduce get_compaction_reenablers_and_lock_holders_for_repair
  compaction: Move compaction_reenabler to compaction_reenabler.hh
  topology_coordinator: Make rpc::remote_verb_error to warning level
  repair: Add metrics for sstable bytes read and skipped from sstables
  test.py: Disable incremental for test_tombstone_gc_for_streaming_and_repair
  test.py: Add tests for tablet incremental repair
  repair: Add tablet incremental repair support
  compaction: Add tablet incremental repair support
  feature_service: Add TABLET_INCREMENTAL_REPAIR feature
  tablet_allocator: Add tablet_force_tablet_count_increase and decrease
  repair: Add incremental helpers
  sstable: Add being_repaired to sstable
  sstables: Add set_repaired_at to metadata_collector
  mutation_compactor: Introduce add operator to compaction_stats
  tablet: Add sstables_repaired_at to system.tablets table
  test: Fix drain api in task_manager_client.py	2025-08-19 13:13:22 +03:00
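The file-based selection described in the 611918056a merge message can be sketched as follows. This is a minimal illustration in Python, not the ScyllaDB implementation: the `SSTable` class and the helpers `is_repaired`, `select_for_repair` and `finish_repair` are hypothetical names, and the explicit exclusion of `repaired_at == 0` is an assumption derived from the statement that 0 is the default for unrepaired sstables.

```python
from dataclasses import dataclass
from typing import List, Optional
from uuid import UUID, uuid4


@dataclass
class SSTable:
    """Hypothetical stand-in for an sstable's repair-related fields."""
    name: str
    repaired_at: int = 0                   # on-disk field; 0 means "not repaired"
    being_repaired: Optional[UUID] = None  # in-memory field; id of the selecting repair


def is_repaired(sst: SSTable, sstables_repaired_at: int) -> bool:
    # repaired_at <= sstables_repaired_at means the sstable is repaired.
    # Assumption: 0 is excluded so that new sstables never count as repaired.
    return sst.repaired_at != 0 and sst.repaired_at <= sstables_repaired_at


def select_for_repair(sstables: List[SSTable], sstables_repaired_at: int,
                      repair_id: UUID) -> List[SSTable]:
    # File-based selection: whole repaired sstables are skipped.
    selected = [s for s in sstables if not is_repaired(s, sstables_repaired_at)]
    for s in selected:
        s.being_repaired = repair_id
    return selected


def finish_repair(sstables: List[SSTable], sstables_repaired_at: int,
                  repair_id: UUID) -> int:
    # Advance the virtual clock and stamp the sstables that took part.
    new_repaired_at = sstables_repaired_at + 1
    for s in sstables:
        if s.being_repaired == repair_id:
            s.repaired_at = new_repaired_at
            s.being_repaired = None
    return new_repaired_at  # value to store as sstables_repaired_at for the tablet
```

In this sketch a second repair that runs after `finish_repair` only selects sstables written since the first one, which is the effect behind the short 2nd and 3rd run times quoted in the merge message.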
Botond Dénes	660ea9202a	docs/dev/tombstone.md: document the memtable overlap check elision optimization	2025-08-11 17:20:12 +03:00
Asias He	5377f87e5a	tablet: Add sstables_repaired_at to system.tablets table It is used to store the repaired_at for each tablet.	2025-08-11 10:10:07 +08:00
Pavel Emelyanov	5fcdf948d9	doc: Update system.clients schema with scheduling_group cell It was added by `9319d65971` (db/virtual_tables: add scheduling group column to system.clients) recently. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#25294	2025-08-05 10:16:20 +03:00
Andrei Chekun	a6a3d119e8	docs: update documentation with new way of running C++ tests The documentation had outdated information on how to run C++ tests. Additionally, some information was added about gathered test metrics. Closes scylladb/scylladb#25180	2025-07-29 14:36:19 +03:00
Pawel Pery	eadbf69d6f	vector_store_client: implement ANN API This patch is a part of the vector_store_client sharded service implementation for communication with the vector-store service. It implements ANN search requests to the vector-store service. It sends the request, receives the response and, after parsing it, returns the list of primary keys. It adds json parsing functionality specific to the HTTP ANN API. It adds a hardcoded http request timeout for retrieving the response from the Vector Store service. It also adds an automatic boost test of the ANN search interface, which uses a mockup http server in the background to simulate the vector-store service. It adds documentation for the HTTP API protocol used for the ANN functionality. Fixes: VS-47	2025-07-09 11:54:51 +02:00
Avi Kivity	dfaed80f55	Merge 'types: add byte-comparable format support for native cql3 types' from Lakshmi Narayanan Sreethar This PR introduces a new `comparable_bytes` class to add byte-comparable format support for all the [native cql3 data types](https://opensource.docs.scylladb.com/stable/cql/types.html#native-types) except `counter` type as that is not comparable. The byte-comparable format is a pre-requisite for implementing the trie based index format for our sstables(https://github.com/scylladb/scylladb/issues/19191). This implementation adheres to the byte-comparable format specification in https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/bytecomparable/ByteComparable.md Note that support for composite data types like lists, maps, and sets has not been implemented yet and will be made available in a separate PR. Refs https://github.com/scylladb/scylladb/issues/19407 New feature - backport not required. Closes scylladb/scylladb#23541 * github.com:scylladb/scylladb: types/comparable_bytes: add testcase to verify compatibility with cassandra types/comparable_bytes: support variable-length natively byte-ordered data types types/comparable_bytes: support decimal cql3 types types/comparable_bytes: introduce count_digits() method types/comparable_bytes: support uuid and timeuuid cql3 types types/comparable_bytes: support varint cql3 type types/comparable_bytes: support skipping sign byte write in decode_signed_long_type types/comparable_bytes: introduce encode/decode_varint_length types/comparable_bytes: support float and double cql3 types types/comparable_bytes: support date, time and timestamp cql3 types types/comparable_bytes: support bigint cql3 type types/comparable_bytes: support fixed length signed integers types/comparable_bytes: support boolean cql3 type types: introduce comparable_bytes class bytes_ostream: overload write() to support writing from FragmentedView docs: fix minor typo in docs/dev/cql3-type-mapping.md	2025-07-02 11:58:32 +03:00
Lakshmi Narayanan Sreethar	068e74b457	docs: fix minor typo in docs/dev/cql3-type-mapping.md Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Michael Litvak	6fa5d2f7c8	docs: topology-over-raft: document co-located tables	2025-07-01 13:20:19 +03:00
Michael Litvak	4777444024	tablets: add base_table column to system.tablets Add a new column base_table to the system.tablets table. It can be set to point to another table to indicate that the tablets of this table are co-located with the tablets of the base table. When it's set, we don't store other tablet information in system.tablets and in the in-memory tablet map object for this table, and we need to refer instead to the base table tablet information. The method get_tablet_map always returns the base tablet map.	2025-07-01 10:29:59 +03:00
Michael Litvak	4e2742a30b	docs: update system.tablets schema The schema of system.tablets in the docs is outdated. replace it with the current schema.	2025-07-01 10:29:59 +03:00
Avi Kivity	b33dd2bd7d	Merge 'sstables/mx/writer: handle non-full prefix row keys' from Botond Dénes Although valid for compact tables, non-full (or empty) clustering key prefixes are not handled for row keys when writing sstables. Only the present components are written; consequently, if the key is empty, it is omitted entirely. When parsing sstables, the parsing code unconditionally parses a full prefix. This mismatch results in parsing failures, as the parser parses part of the row content as a key, resulting in a garbage key and subsequent mis-parsing of the row content and maybe even subsequent partitions. Introduce a new system table: `system.corrupt_data` and infrastructure similar to `large_data_handler`: `corrupt_data_handler`, which abstracts how corrupt data is handled. The sstable writer now passes rows with such corrupt keys to the corrupt data handler. This way, we avoid corrupting the sstables beyond parsing and the rows are also kept around in system.corrupt_data for later inspection and possible recovery. Add a full-stack test which checks that rows with bad keys are correctly handled. Fixes: https://github.com/scylladb/scylladb/issues/24489 The bug is present in all versions, has to be backported to all supported versions. 
Closes scylladb/scylladb#24492 * github.com:scylladb/scylladb: test/boost/sstable_datafile_test: add test for corrupt data sstables/mx/writer: handler rows with empty keys test/lib/cql_assertions: introduce columns_assertions sstables: add corrupt_data_handler to sstables::sstables tools/scylla-sstable: make large_data_handler a local db: introduce corrupt_data_handler mutation: introduce frozen_mutation_fragment_v2 mutation/mutation_partition_view: read_{clustering,static}_row(): return row type mutation/mutation_partition_view: extract de-ser of {clustering,static} row idl-compiler.py: generate skip() definition for enums serializers idl: extract full_position.idl from position_in_partition.idl db/system_keyspace: add apply_mutation() db/system_keyspace: introduce the corrupt_data table	2025-06-29 18:18:36 +03:00
Robert Bindar	6e7cab5b45	Add repository layout dev documentation This change adds an md file which gives a high level overview of the scylladb repository, the components each path contains and a basic description for each one of them. This is mainly intended for onboarding engineers to help get a mental picture when starting ramping up on Scylla concepts. Refs #22908 Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#23010	2025-06-25 13:58:05 +03:00
Botond Dénes	92b5fe8983	db/system_keyspace: introduce the corrupt_data table To serve as a place to store corrupt mutation fragments. These fragments cannot be written to sstables, as they would be spread around by compaction and/or repair. They even might make parsing the sstable impossible. So they are stored in this special table instead, kept around to be inspected later and possibly restored if possible.	2025-06-24 11:05:30 +03:00
Patryk Jędrzejczak	6489308ebc	Merge 'Introduce a queue of global topology requests.' from Gleb Natapov Currently only one global topology request (such as truncate, cdc repair, cleanup and alter table) can be pending. If one is already pending others will be rejected with an error. This is not very user friendly, so this series introduces a queue of global requests which allows queuing many global topology requests simultaneously. Fixes: #16822 No need to backport since this is a new feature. Closes scylladb/scylladb#24293 * https://github.com/scylladb/scylladb: topology coordinator: simplify truncate handling in case request queue feature is disable topology coordinator: fix indentation after the previous patch topology coordinator: allow running multiple global commands in parallel topology coordinator: Implement global topology request queue topology coordinator: Do not cancel global requests in cancel_all_requests topology coordinator: store request type for each global command topology request: make it possible to hold global request types in request_type field topology coordinator: move alter table global request parameters into topology_request table topology coordinator: move cleanup global command to report completion through topology_request table topology coordinator: no need to create updates vector explicitly topology coordinator: use topology_request_tracking_mutation_builder::done() instead of open code it topology coordinator: handle error during new_cdc_generation command processing topology coordinator: remove unneeded semicolon topology coordinator: fix indentation after the last commit topology coordinator: move new_cdc_generation topology request to use topology_request table for completion gms/feature_service: add TOPOLOGY_GLOBAL_REQUEST_QUEUE feature flag	2025-06-23 16:08:09 +03:00
Robert Bindar	1dd37ba47a	Add dev documentation for manipulating s3 data manually This patch intends to give an overview of where, when and how we store data in S3 and provide a quick set of commands which help gain local access to the data in case there is a need for manual intervention. The patch also collects in the same place links/descriptions for all formats we use in S3. Fixes #22438 Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#24323	2025-06-17 13:21:30 +03:00
Gleb Natapov	a0a3a034e0	topology coordinator: Implement global topology request queue Requests, together with their parameters, are added to the topology_request tables and the queue of active global requests is kept in topology state. They are processed one by one by the topology state machine. Fixes: #16822	2025-06-11 11:29:33 +03:00
Evgeniy Naydanov	cdc4b520da	test.py: cql: run tests using bare pytest command Create a custom pytest test collector for .cql files and move CQL test execution logic from `CQLApprovalTest` class and `pylib/cql_repl/cql_repl.py` file to `CqlTest.runtest()` method. In result, the only difference between CQLApproval and Python suite types is suffixes of test files.	2025-06-03 07:54:51 +00:00
Andrzej Jackowski	086df24555	transport: implement SCYLLA_USE_METADATA_ID support Metadata id was introduced in CQLv5 to make metadata of prepared statement consistent between driver and database. This commit introduces a protocol extension that allows using the same mechanism in CQLv4. This change: - Introduce SCYLLA_USE_METADATA_ID protocol extension for CQLv4 - Introduce METADATA_CHANGED flag in RESULT. The flag comes directly from the CQLv5 binary protocol. In CQLv4, the bit was never used, so we assume it is safe to reuse it. - Implement handling of metadata_id and METADATA_CHANGED in RESULT rows - Implement returning metadata_id in RESULT prepared - Implement reading metadata_id from EXECUTE - Added description of SCYLLA_USE_METADATA_ID in documentation Metadata_id is wrapped in cql_metadata_id_wrapper because we need to distinguish the following situations: - Metadata_id is not supported by the protocol (e.g. CQLv4 without the extension is used) - Metadata_id is supported by the protocol but not set - e.g. PREPARE query is being handled: it doesn't contain metadata_id in the request but the reply (RESULT prepared) must contain metadata_id - Metadata_id is supported by the protocol and set, any number of bytes >= 0 is allowed, according to the CQLv5 protocol specification Fixes scylladb/scylladb#20860	2025-05-14 09:59:16 +02:00
Andrei Chekun	747f2b1301	docs: add more steps in installation of test.py Documentation for the --gather-metric parameter was missing. This functionality can break the regular flow of using test.py, because of possible misconfiguration of the cgroup on the local machine. Added an explanation of how to deal with potential issues in the metrics-gathering functionality and how to switch it off. Fixes: https://github.com/scylladb/scylladb/issues/20763 Closes scylladb/scylladb#24095	2025-05-13 13:08:18 +03:00
Kefu Chai	3e3f583b84	docs/dev/tombstone.md: fix a typo s/alwas/always/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23734	2025-04-15 10:54:42 +03:00
Avi Kivity	5e1cf90a51	build: replace tools/java submodule with packaged cassandra-stress We no longer use tools/java (scylladb/scylla-tools-java.git) for nodetool or cqlsh; only cassandra-stress. Since that is available in package form install that and excise the tools/java submodule from the source tree. pgo/ is adjusted to use the packaged cassandra-stress (and the cqlsh submodule). A few jmx references are dropped as well. Frozen toolchain regenerated. Optimized clang from https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-aarch64.tar.gz https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-x86_64.tar.gz Closes scylladb/scylladb#23698	2025-04-15 10:11:28 +03:00
Avi Kivity	9559e53f55	Merge 'Adjust tablet-mon.py for capacity-aware load balancing' from Tomasz Grabiec After load-balancer was made capacity-aware it no longer equalizes tablet count per shard, but rather utilization of shard's storage. This makes the old presentation mode not useful in assessing whether balance was reached, since nodes with less capacity will get fewer tablets when in balanced state. This PR adds a new default presentation mode which scales tablet size by its storage utilization so that tablets which have equal shard utilization take equal space on the graph. To facilitate that, a new virtual table was added: system.load_per_node, which allows the tool to learn about load balancer's view on per-node capacity. It can also serve as a debugging interface to get a view of current balance according to the load-balancer. Closes scylladb/scylladb#23584 * github.com:scylladb/scylladb: tablet-mon.py: Add presentation mode which scales tablet size by its storage utilization tablet-mon.py: Center tablet id text properly in the vertical axis tablet-mon.py: Show migration stage tag in table mode only when migrating virtual-tables: Introduce system.load_per_node virtual_tables: memtable_filling_virtual_table: Propagate permit to execute() docs: virtual-tables: Fix instructions service: tablets: Keep load_stats inside tablet_allocator	2025-04-10 14:59:08 +03:00
Tomasz Grabiec	b5211cca85	Merge 'tablets: rebuild: use repair for tablet rebuild' from Aleksandra Martyniuk Currently, when we rebuild a tablet, we stream data from all replicas. This creates a lot of redundancy, wastes bandwidth and CPU resources. In this series, we split the streaming stage of tablet rebuild into two phases: first we stream tablet's data from only one replica and then repair the tablet. Fixes: https://github.com/scylladb/scylladb/issues/17174. Needs backport to 2025.1 to prevent out of space during streaming Closes scylladb/scylladb#23187 * github.com:scylladb/scylladb: test: add test for rebuild with repair locator: service: move to rebuild_v2 transition if cluster is upgraded locator: service: add transition to rebuild_repair stage for rebuild_v2 locator: service: add rebuild_repair tablet transition stage locator: add maybe_get_primary_replica locator: service: add rebuild_v2 tablet transition kind gms: add REPAIR_BASED_TABLET_REBUILD cluster feature	2025-04-09 21:35:37 +02:00
Tomasz Grabiec	0b9a75d7b6	virtual-tables: Introduce system.load_per_node Can be used to query per-node stats about load as seen by the load balancer. In particular, node's capacity will be used by tablet-mon.py to scale tablet columns so that equal height is equal node utilization.	2025-04-09 20:21:51 +02:00
Tomasz Grabiec	34beaa30b5	docs: virtual-tables: Fix instructions	2025-04-09 20:21:51 +02:00
Botond Dénes	583a813d17	docs/dev/tombstone.md: fix link to ddl.html Closes scylladb/scylladb#23622	2025-04-08 16:18:50 +03:00
Aleksandra Martyniuk	eb17af6143	locator: service: add transition to rebuild_repair stage for rebuild_v2 Modify write_both_read_old and streaming stages in rebuild_v2 transition kind: write_both_read_old moves to rebuild_repair stage and streaming stage streams data only from one replica.	2025-04-08 10:42:02 +02:00
Aleksandra Martyniuk	ed7b8bb787	locator: service: add rebuild_v2 tablet transition kind Currently, in the streaming stage of rebuild tablet transition, we stream tablet data from all replicas. This patch series splits the streaming stage into two phases: - repair phase, where we repair the tablet; - streaming phase, where we stream tablet data from one replica. To differentiate the two streaming methods, a new tablet transition kind - rebuild_v2 - is added. The transitions and stages for rebuild_v2 transition kind will be added in the following patches.	2025-04-08 10:42:01 +02:00
Botond Dénes	3bad46a6e2	docs/dev: add tombstone.md An exhaustive document on the tombstone related internal logic as well as the user-facing aspects. Closes scylladb/scylladb#23454	2025-04-01 20:17:57 +03:00
Pavel Emelyanov	2ee9cec1d3	Merge 'Remove object_storage.yaml and move the endpoints to scylla.yaml' from Robert Bindar Move `object_storage.yaml` endpoints to `scylla.yaml` This change also removes the `object_storage.yaml` file altogether and adds tests for fetching the endpoints via the `v2/config/object_storage_endpoints` REST api. Also, `object_storage_config_file` options is moved to a deprecated state as it's no longer needed. This PR depends on #22951, the reviewers should review patch 393e1ac0ec066475ca94094265a5f88dbbdb1a1f Refs https://github.com/scylladb/scylladb/issues/22428 Closes scylladb/scylladb#22952 * github.com:scylladb/scylladb: Remove db::config::object_storage_config Move `object_storage.yaml` endpoints to `scylla.yaml`	2025-04-01 16:01:44 +03:00
Michał Chojnowski	d33ffb221b	docs/dev: add sstable-compression-dicts.md	2025-04-01 00:07:31 +02:00
Robert Bindar	e3a3508960	Move `object_storage.yaml` endpoints to `scylla.yaml` This change also removes the `object_storage.yaml` file altogether and adds tests for fetching the endpoints via the `v2/config/object_storage_endpoints` REST api. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>	2025-03-31 13:39:39 +03:00
Avi Kivity	2b9e1e61d0	docs: reader_concurrency_semaphore: document CPU concurrency limit Document the CPU concurrency implemented in `3d816b7c16` and adjusted in `3d12451d1f`. Closes scylladb/scylladb#23404	2025-03-31 09:39:55 +03:00
Aleksandra Martyniuk	4c75701756	docs: locator: update the docs and formatter of tablet_task_info	2025-02-14 09:13:11 +01:00
TripleChecker	8d64be94e2	Fix typos	2025-02-13 01:54:08 +02:00
TripleChecker	e72e6fadeb	Fix typos	2025-02-11 00:17:43 +02:00
Pavel Emelyanov	81f7a6d97d	doc: Update system.sstables table schema description The partition key had been renamed and its type changed some time ago, but the doc wasn't updated. Fix it. refs: #20998 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#22683	2025-02-10 16:09:49 +02:00
Botond Dénes	51a273401c	Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec This PR converts boost load balancer tests in preparation for load balancer changes which add per-table tablet hints. After those changes, load balancer consults with the replication strategy in the database, so we need to create proper schema in the database. To do that, we need proper topology for replication strategies which use RF > 1, otherwise keyspace creation will fail. Topology is created in tests via group0 commands, which is abstracted by the new `topology_builder` class. Tests cannot modify token_metadata only in memory now as it needs to be consistent with the schema and on-disk metadata. That's why modifications to tablet metadata are now made under group0 guard and save back metadata to disk. Closes scylladb/scylladb#22648 * github.com:scylladb/scylladb: test: tablets: Drop keyspace after do_test_load_balancing_merge_colocation() scenario tests: tablets: Set initial tablets to 1 to exit growing mode test: tablets_test: Create proper schema in load balancer tests test: lib: Introduce topology_builder test: cql_test_env: Expose topology_state_machine topology_state_machine: Introduce lock transition	2025-02-10 16:08:41 +02:00
Avi Kivity	9712390336	Merge 'Add per-table tablet options in schema' from Benny Halevy This series extends the table schema with per-table tablet options. The options are used as hints for initial tablet allocation on table creation and later for resize (split or merge) decisions, when the table size changes. * New feature, no backport required Closes scylladb/scylladb#22090 * github.com:scylladb/scylladb: tablets: resize_decision: get rid of initial_decision tablet_allocator: consider tablet options for resize decision tablet_allocator: load_balancer: table_size_desc: keep target_tablet_size as member network_topology_strategy: allocate_tablets_for_new_table: consider tablet options network_topology_strategy: calculate_initial_tablets_from_topology: precalculate shards per dc using for_each_token_owner network_topology_strategy: calculate_initial_tablets_from_topology: set default rf to 0 cql3: data_dictionary: format keyspace_metadata: print "enabled":true when initial_tablets=0 cql3/create_keyspace_statement: add deprecation warning for initial tablets test: cqlpy: test_tablets: add tests for per-table tablet options schema: add per-table tablet options feature_service: add TABLET_OPTIONS cluster schema feature	2025-02-08 20:32:19 +02:00
Tomasz Grabiec	61532eb53b	topology_state_machine: Introduce lock transition Will be used in load balancer tests to prevent concurrent topology operations, in particular background load balancing. load balancer will be invoked explicitly by the test. Disabling load balancer in topology is not a solution, because we want the explicit call to perform the load balancing.	2025-02-07 16:09:21 +01:00
Avi Kivity	861fb58e14	Merge 'vector: add support for vector type' from Dawid Pawlik This pull request is an implementation of vector data type similar to one used by Apache Cassandra. The patch contains: - implementation of vector_type_impl class - necessary functionalities similar to other data types - support for serialization and deserialization of vectors - support for Lua and JSON format - valid CQL syntax for `vector<>` type - `type_parser` support for vectors - expression adjustments such as: - add `collection_constructor::style_type::vector` - rename `collection_constructor::style_type::list` to `collection_constructor::style_type::list_or_vector` - vector type encoding (for drivers) - unit tests - cassandra compatibility tests - necessary documentation Co-authored-by: @janpiotrlakomy Fixes https://github.com/scylladb/scylladb/issues/19455 Closes scylladb/scylladb#22488 * github.com:scylladb/scylladb: docs: add vector type documentation cassandra_tests: translate tests covering the vector type type_codec: add vector type encoding boost/expr_test: add vector expression tests expression: adjust collection constructor list style expression: add vector style type test/boost: add vector type cql_env boost tests test/boost: add vector type_parser tests type_parser: support vector type cql3: add vector type syntax types: implement vector_type_impl	2025-02-06 20:36:50 +02:00
Benny Halevy	c5668d99c9	schema: add per-table tablet options Unlike with vnodes, each tablet is served only by a single shard, and it is associated with a memtable that, when flushed, creates sstables whose token range is confined to the tablet owning them. On one hand, this allows for far better agility and elasticity since migration of tablets between nodes or shards does not require rewriting most if not all of the sstables, as required with vnodes (at the cleanup phase). Having too few tablets might limit performance due to not being served by all shards or due to imbalance between shards caused by quantization. The number of tablets per table has to be a power of 2 with the current design, and when divided by the number of shards, some shards will serve N tablets, while others may serve N+1, and when N is small (N+1)/N may be significantly larger than 1. For example, with N=1, some shards will serve 2 tablet replicas and some will serve only 1, causing an imbalance of 100%. Now, simply allocating a lot more tablets for each table may theoretically address this problem, but practically: a. Each tablet has memory overhead and having too many tablets in the system with many tables and many tablets for each of them may overwhelm the system and cause out-of-memory errors. b. Too-small tablets cause a proliferation of small sstables that are less efficient to access, have higher metadata overhead (due to per-sstable overhead), and might exhaust the system's open file-descriptor limits. The options introduced in this change can help the user tune the system in two ways: 1. Sizing the table to prevent unnecessary tablet splits and migrations. This can be done when the table is created, or later on, using ALTER TABLE. 2. Controlling min_per_shard_tablet_count to improve tablet balancing, for hot tables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-02-06 08:55:51 +02:00
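The quantization imbalance described in c5668d99c9 can be worked through with a small calculation. The sketch below is illustrative only: `shard_imbalance` is a hypothetical helper, and it assumes tablet replicas are spread over shards as evenly as possible, so every shard serves either N or N+1 of them.

```python
import math


def shard_imbalance(tablet_replicas: int, shards: int) -> float:
    """Relative extra load on the busiest shards, ((N+1) - N) / N.

    Hypothetical helper; assumes replicas are spread as evenly as possible.
    """
    if tablet_replicas <= 0 or shards <= 0:
        raise ValueError("tablet_replicas and shards must be positive")
    n, remainder = divmod(tablet_replicas, shards)
    if remainder == 0:
        return 0.0       # every shard serves exactly N replicas
    if n == 0:
        return math.inf  # some shards serve nothing at all
    return 1.0 / n       # some shards serve N, others N+1


# N=1: some shards serve 2 replicas, others 1, i.e. a 100% imbalance.
print(shard_imbalance(12, 8))
# A larger N shrinks the gap: 16 vs 17 replicas per shard.
print(shard_imbalance(130, 8))
```

With these assumptions the imbalance falls as 1/N, which is why a higher per-shard tablet count helps hot tables, at the cost of the per-tablet memory and small-sstable overheads listed in the commit message.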
Ernest Zaslavsky	29e60288de	docs: update the `object_storage.md` and `admin.rst` Added additional options and best practices for AWS authentication.	2025-02-05 14:57:19 +02:00
Dawid Pawlik	a68bf6dcc1	docs: add vector type documentation Add missing vector type documentation including: definition of vector, adjustment of term definition, JSON encoding, Lua and cql3 type mapping, vector dimension limit, and keyword specification.	2025-01-28 21:14:49 +01:00
Aleksandra Martyniuk	18cc79176a	api: task_manager: do not unregister tasks on get_status Currently, /task_manager/task_status_recursive/{task_id} and /task_manager/task_status/{task_id} unregister the queried task if it has already finished. The status should not disappear after being queried. Do not unregister a finished task when its status or recursive status is queried.	2025-01-27 11:23:45 +01:00
Aleksandra Martyniuk	e37d1bcb98	api: task_manager: add /task_manager/drain In the following patches, get_status won't be unregistering finished tasks. However, tests need a way to drop a task, so that they can operate only on the tasks for operations that were invoked by these tests. Add /task_manager/drain/{module} to unregister all finished tasks from the module. Add the respective nodetool command.	2025-01-27 11:23:45 +01:00
Paweł Zakrzewski	702e727e33	audit: Add documentation for the audit subsystem Adds detailed documentation covering the new audit subsystem: - Add new audit.md design document explaining: - Core concepts and design decisions - CQL extensions for audit management - Implementation details and trigger evaluation - Prior art references from other databases - Add user-facing documentation: - New auditing.rst guide with configuration and usage details - Integration with security documentation index - Updates to cluster management procedures - Updates to security checklist The documentation covers all aspects of the audit system including: - Configuration options and storage backends (syslog/table) - Audit categories (DCL/DDL/AUTH/DML/QUERY/ADMIN) - Permission model and security considerations - Failure handling and logging - Example configurations and output formats This ensures users have complete guidance for setting up and using the new audit capabilities.	2025-01-15 11:10:35 +01:00
Kefu Chai	f8885a4afd	dist/docker,docs: replace "--experimental" with "--experimental-features" The "--experimental" option was removed in commit `f6cca741ea`. Using this deprecated option now causes Scylla to fail with the error: ``` error: the argument ('on') for option '--experimental-features' is invalid ``` So, in this change, let's update the docker entry point script to use `--experimental-features` command line option instead. The related document is updated accordingly. Fixes scylladb/scylladb#22207 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22283	2025-01-14 07:56:38 -05:00
Avi Kivity	814942505f	Merge 'Introduce Encryption-at-Rest (EAR) for sstables and commitlog' from Calle Wilund Fixes https://github.com/scylladb/scylla-enterprise/issues/5016#issuecomment-2558464631 EAR - encryption at rest. Allows on-disk file encryption of sstables and commitlog data. Introduces OpenSSL based file level encrypted storage, managed via a set of providers ranging from local files to cloud KMS providers. For a more comprehensive explanation, see the included docs (or if possible, original source tree). Manual bulk merge of EAR feature from enterprise repo to main scylla repo. Breaks some features apart, but main EAR is still a humongous commit, because to separate this I would have to mess with code incrementally, adding time and risk. This PR includes the local file gen tool, tests and also p11 validation. Note: CI will not execute the full tests unless master CI is set to provide the same environment as the enterprise one. Not sure about the status of this ATM. Note: Includes code to compile against cryptsoft kmipc SDK, but not the SDK. If you happen to check out this tree in the scylla folder and configure, it will be linked against and KMIP functionality will be enabled, otherwise not. Closes scylladb/scylladb#22233 * github.com:scylladb/scylladb: docs: Add EAR docs main/build: Add p11-kit and initialize tools: Add local-file-key-generator tool tests: Add EAR tests tmpdir: shorten test tempdir path EAR: port the ear feature from enterprise cql_test_env: Add optional query timeout schema/migration_manager: Add schema validate sstables: add get_shared_components accessor config/config_file: Add exports and definitions of config_type_for<>	2025-01-12 16:10:46 +02:00


268 Commits