scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-24 00:32:15 +00:00

Author	SHA1	Message	Date
Michael Litvak	246635c426	test/test_view_build_status: fix wrong assert in test The test expects and asserts that after wait_for_view is completed we read the view_build_status table and get a row for each node and view. But this is wrong because wait_for_view may have read the table on one node, and then we query the table on a different node that didn't insert all the rows yet, so the assert could fail. To fix it we change the test to retry and check that eventually all expected rows are found and then eventually removed on the same host. Fixes scylladb/scylladb#22547 Closes scylladb/scylladb#22585 (cherry picked from commit `44c06ddfbb`) Closes scylladb/scylladb#22608	2025-02-03 09:24:17 +01:00
Michael Litvak	58eda6670f	view_builder: fix loop in view builder when tokens are moved The view builder builds a view by going over the entire token ring, consuming the base table partitions, and generating view updates for each partition. A view is considered as built when we complete a full cycle of the token ring. Suppose we start to build a view at a token F. We will consume all partitions with tokens starting at F until the maximum token, then go back to the minimum token and consume all partitions until F, and then we detect that we pass F and complete building the view. This happens in the view builder consumer in `check_for_built_views`. The problem is that we check if we pass the first token F with the condition `_step.current_token() >= it->first_token` whenever we consume a new partition or the current_token goes back to the minimum token. But suppose that we don't have any partitions with a token greater than or equal to the first token (this could happen if the partition with token F was moved to another node for example), then this condition will never be satisfied, and we don't detect correctly when we pass F. Instead, we go back to the minimum token, building the same token ranges again, in a possibly infinite loop. To fix this we add another step when reaching the end of the reader's stream. When this happens it means we don't have any more fragments to consume until the end of the range, so we advance the current_token to the end of the range, simulating a partition, and check for built views in that range. Fixes scylladb/scylladb#21829 Closes scylladb/scylladb#22493 (cherry picked from commit `6d34125eb7`) Closes scylladb/scylladb#22607	2025-02-02 22:29:52 +02:00
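The loop described in the commit above can be reproduced with a small model. This is only a sketch under stated assumptions: `token`, `build_progress`, `consume_range`, and `passes_until_built` are hypothetical names invented for illustration, tokens are plain 64-bit integers, and the only thing taken from the commit message is the shape of the check (`current_token >= first_token`) and the idea of advancing to the end of the range when the reader's stream ends.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical, simplified model of the view builder; not ScyllaDB code.
using token = std::int64_t;

struct build_progress {
    token first_token;     // token F at which the build started
    bool wrapped = false;  // true once the scan went back to the minimum token
    bool built = false;    // true once a full cycle of the ring was detected
};

// Same shape as the condition quoted in the commit message: after wrapping
// around, the view counts as built once the current token reaches or passes F.
inline void check_for_built_view(build_progress& p, token current) {
    if (p.wrapped && current >= p.first_token) {
        p.built = true;
    }
}

// Consumes the partitions whose tokens fall in [start, end]. When
// advance_at_end_of_stream is set, the end of the range is treated as a
// simulated partition, which is the extra step the fix describes.
inline void consume_range(build_progress& p, const std::vector<token>& partitions,
                          token start, token end, bool advance_at_end_of_stream) {
    for (token t : partitions) {
        if (t >= start && t <= end) {
            check_for_built_view(p, t);
        }
    }
    if (advance_at_end_of_stream) {
        check_for_built_view(p, end);
    }
}

// Returns the number of wrapped passes needed to mark the view as built,
// or -1 if it is still not built after max_passes (the "infinite loop" case).
inline int passes_until_built(const std::vector<token>& partitions, token first,
                              bool with_fix, int max_passes = 5) {
    const token min_token = std::numeric_limits<token>::min();
    const token max_token = std::numeric_limits<token>::max();
    build_progress p{first};
    consume_range(p, partitions, first, max_token, with_fix);  // F .. max
    for (int pass = 1; pass <= max_passes; ++pass) {
        p.wrapped = true;
        consume_range(p, partitions, min_token, max_token, with_fix);
        if (p.built) {
            return pass;
        }
    }
    return -1;
}
```

In this model, a node that owns no partition with a token at or above F never satisfies the check without the end-of-stream step, while with it the build completes on the first wrapped pass.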
Nikos Dragazis	d1e8b02260	encrypted_file_test: Test reads beyond decrypted file length Add a test to reproduce a bug in the read DMA API of `encrypted_file_impl` (the file implementation for Encryption-at-Rest). The test creates an encrypted file that contains padding, and then attempts to read from an offset within the padding area. Although this offset is invalid on the decrypted file, the `encrypted_file_impl` makes no checks and proceeds with the decryption of padding data, which eventually leads to bogus results. Refs #22236. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> (cherry picked from commit `8f936b2cbc`) (cherry picked from commit `2fb95e4e2f`)	2025-01-30 09:17:31 +00:00
Calle Wilund	a51888694e	encrypted_file_impl: Check for reads on or past actual file length in transform Fixes #22236 If reading a file and not stopping on block bounds returned by `size()`, we could allow reading from (_file_size+1-15) (block boundary) and try to decrypt this buffer (last one). Check on last block in `transform` would wrap around size due to us being >= file size (l). Simplest example: Actual data size: 4095 Physical file size: 4095 + key block size (typically 16) Read from 4096: -> 15 bytes (padding) -> transform return _file_size - read offset -> wraparound -> rather larger number than we expected (not to mention the data in question is junk/zero). Just do an early bounds check and return zero if we're past the actual data limit. v2: * Moved check to a min expression instead * Added lengthy comment * Added unit test v3: * Fixed read_dma_bulk handling of short, unaligned read * Added test for unaligned read v4: * Added another unaligned test case (cherry picked from commit `e96cc52668`)	2025-01-30 09:17:31 +00:00
Aleksandra Martyniuk	dcf436eb84	test: add test to check if repair handles no_such_keyspace (cherry picked from commit `54e7f2819c`)	2025-01-28 21:50:35 +00:00
Asias He	4018dc7f0d	Introduce file stream for tablet File based stream is a new feature that optimizes tablet movement significantly. It streams the entire SSTable files without deserializing SSTable files into mutation fragments and re-serializing them back into SSTables on receiving nodes. As a result, less data is streamed over the network, and less CPU is consumed, especially for data models that contain small cells. The following patches are imported from the scylla enterprise: ) Merge 'Introduce file stream for tablet' from Asias He This patch uses Seastar RPC stream interface to stream sstable files on network for tablet migration. It streams sstables instead of mutation fragments. The file based stream has multiple advantages over the mutation streaming. - No serialization or deserialization for mutation fragments - No need to read and process each mutation fragment - On wire data is more compact and smaller In the test below, a significant speed up is observed. Two nodes, 1 shard per node, 1 initial_tablets: - Start node 1 - Insert 10M rows of data with c-s - Bootstrap node 2 Node 1 will migrate data to node 2 with the file stream. 
Test results: 1) File stream: bytes on wire = 1132006250 bytes, bw = 836MB/s [shard 0:stre] stream_blob - stream_sstables[eadaa8e0-a4f2-4cc6-bf10-39ad1ce106b0] Finished sending sstable_nr=2 files_nr=18 files={} range=(-1,9223372036854775807] bytes_sent=1132006250 stream_bw=836MB/s [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 1.08004s seconds 2) Mutation stream: bytes on wire = 3030004736 bytes, bw = 125410.87 KiB/s = 128MB/s [shard 0:stre] stream_session - [Stream #406dc8b0-56b5-11ee-bc2d-000bf4871058] Streaming plan for Tablet migration-ks1-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=2958989 KiB, 125410.87 KiB/s [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 23.5992s seconds Test Summary: File stream v.s. Mutation stream improvements - Stream bandwidth = 836 / 128 (MB/s) = 6.53X - Stream time = 23.60 / 1.08 (Seconds) = 21.85X - Stream bytes on wire = 3030004736 / 1132006250 (Bytes)= 2.67X Closes scylladb/scylla-enterprise#3438 github.com:scylladb/scylla-enterprise: tests: Add file_stream_test streaming: Implement file stream for tablet ) streaming: Use new take_storage_snapshot interface The new take_storage_snapshot returns a file object instead of a file name. This allows the file stream sender to read from the file even if the file is deleted by compaction. Closes scylladb/scylla-enterprise#3728 ) streaming: Protect unsupported file types for file stream Currently, we assume the file streamed over the stream_blob rpc verb is a sstable file. This patch rejects the unsupported file types on the receiver side. This allows us to stream more file types later using the current file stream infrastructure without worrying about old nodes processing the new file types in the wrong way. 
- The file_ops::noop is renamed to file_ops::stream_sstables to be explicit about the file types - A missing test_file_stream_error_injection is added to the idl Fixes: #3846 Tests: test_unsupported_file_ops Closes scylladb/scylla-enterprise#3847 ) idl: Add service::session_id id to idl It will be used in the next patch. Refs #3907 ) streaming: Protect file stream with topology_guard Similar to "storage_service, tablets: Use session to guard tablet streaming", this patch protects file stream with topology_guard. Fixes #3907 ) streaming: Take service topology_guard under the try block Taking the service::topology_guard could throw. Currently, it throws outside the try block, so the rpc sink will not be closed, causing the following assertion: ``` scylla: seastar/include/seastar/rpc/rpc_impl.hh:815: virtual seastar::rpc::sink_impl<netw::serializer, streaming::stream_blob_cmd_data>::~sink_impl() [Serializer = netw::serializer, Out = <streaming::stream_blob_cmd_data>]: Assertion `this->_con->get()->sink_closed()' failed. ``` To fix, move more code including the topology_guard taking code to the try block. Fixes https://github.com/scylladb/scylla-enterprise/issues/4106 Closes scylladb/scylla-enterprise#4110 ) Merge 'Preserve original SSTable state with file based tablet migration' from Raphael "Raph" Carvalho We're not preserving the SSTable state across file based migration, so staging SSTables for example are being placed into main directory, and consequently, we're mixing staging and non-staging data, losing the ability to continue from where the old replica left off. It's expected that the view update backlog is transferred from old into new replica, as migration doesn't wait for leaving replica to complete view update work (which can take long). Elasticity is preferred. So this fix guarantees that the state of the SSTable will be preserved by propagating it in form of subdirectory (each subdirectory is statically mapped with a particular state). 
The staging sstables aren't being registered into view update generator yet, as that's supposed to be fixed in OSS (more details can be found at https://github.com/scylladb/scylladb/issues/19149). Fixes #4265. Closes scylladb/scylla-enterprise#4267 * github.com:scylladb/scylla-enterprise: tablet: Preserve original SSTable state with file based tablet migration sstables: Add get method for sstable state ) sstable: (Re-)add shareabled_components getter ) Merge 'File streaming sstables: Use sstable source/sink to transfer snapshots' from Calle Wilund Fixes #4246 Alternative approach/better separation of concern, transport vs. sstable layer. Builds on #4472, but fancier. Ensures we transfer and pre-process scylla metadata for streamed file blobs first, then properly apply receiving nodes local config by using a source and sink layer exported from sstables, which handles things like ordering, metadata filtering (on source) as well as handling metadata and proper IO paths when writing data on receiver node (sink). This implementation maintains the statelessness of the current design, and the delegated sink side will re-read and re-write the metadata for each component processed. This is a little wasteful, but the meta is small, and it is less error prone than trying to do caching cross-shards etc. The transport is isolated from the knowledge. This is an alternative/complement to #4436 and #4472, fixing the underlying issue. Note that while the layers/API:s here allows easy fixing of other fundamental problems in the feature (such as destination location etc), these are not included in the PR, to keep it as close to the current behaviour as possible. 
Closes scylladb/scylla-enterprise#4646 * github.com:scylladb/scylla-enterprise: raft_tests: Copy/add a topology test with encryption file streaming: Use sstable source/sink to transfer snapshots sstables: Add source and sink objects + producers for transfering a snapshot sstable::types: Add remove accessor for extension info in metadata ) The change for error injection in merge commit 966ea5955dd8760: File streaming now has "stream_mutation_fragments" error injection points so test_table_dropped_during_streaming works with file streaming. ) doc: document file-based streaming This commit adds a description of the file-based streaming feature to the documentation. It will be displayed in the docs using the scylladb_include_flag directive after https://github.com/scylladb/scylladb/pull/20182 is merged, backported to branch-6.0, and, in turn, branch-2024.2. Refs https://github.com/scylladb/scylla-enterprise/issues/4585 Refs https://github.com/scylladb/scylla-enterprise/issues/4254 Closes scylladb/scylla-enterprise#4587 ) doc: move File-based streaming to the Tablets source file-based-streaming This commit moves the description of file-based streaming from a common include file to the regular doc source file where tablets are described. Closes scylladb/scylla-enterprise#4652 ) streaming: sstable_stream_sink_impl: abort: prevent null pointer dereference Closes scylladb/scylladb#22467	2025-01-26 12:51:59 +02:00
Botond Dénes	e038473887	test/raft/replication.hh: add missing include <fmt/std.h>	2025-01-23 07:29:01 -05:00
Botond Dénes	e60e575cb0	test/boost/bptree_validation.hh: add missing include <fmt/format.h>	2025-01-23 06:05:57 -05:00
Avi Kivity	0092bb5831	Merge 'main: rename `cql_sg_stats` metrics on scheduling group rename' from Piotr Dulikowski This PR contains the missing part of a fix for scylladb/scylla-enterprise#4912 which was omitted during migration of workload prioritization to the source available repository. Even though the regression test for it was ported, it was silently made ineffective by a different fix (scylladb/scylla-enterprise#4764), so this PR also improves the test. Fixes: scylladb/scylladb#22404 No need to backport - service levels are not yet a part of any source-available release. Closes scylladb/scylladb#22416 * github.com:scylladb/scylladb: test/auth_cluster: make test_service_level_metric_name_change useful main: rename `cql_sg_stats` metrics on scheduling group rename	2025-01-22 14:22:09 +02:00
Avi Kivity	59d3a66d18	Revert "Introduce file stream for tablet" This reverts commit `8208688178`. It was contributed from enterprise, but is too different from the original for me to merge back.	2025-01-22 09:42:20 +02:00
Nadav Har'El	a8805c4fc1	Merge 'cql3, test, utils: switch from boost::adaptors::uniqued to utils::views:unique ' from Kefu Chai In order to reduce the dependency on external libraries, and for better integration with ranges in the C++ standard library, let's use the homebrew `utils::views::unique()` before unique is accepted by the C++ standard. --- It's a cleanup, hence no need to backport. Closes scylladb/scylladb#22393 * github.com:scylladb/scylladb: cql3, test: switch from boost::adaptors::uniqued to utils::views:unique utils: implement drop-in replacement for replacing boost::adaptors::uniqued	2025-01-21 19:06:21 +02:00
Sergey Zolotukhin	38caabe3ef	test: Fix inconsistent naming of the log files. The log file names created in `scylla_cluster.py` by `ScyllaClusterManager` and files to be collected in conftest.py by `manager` should be in sync. This patch fixes the issue, originally introduced in scylladb/scylladb#22192 Fixes scylladb/scylladb#22387 Backports: 6.1 and 6.2. Closes scylladb/scylladb#22415	2025-01-21 10:45:17 +02:00
Kefu Chai	ccb7b4e606	cql3, test: switch from boost::adaptors::uniqued to utils::views:unique In order to reduce the dependency on external libraries, and for better integration with ranges in the C++ standard library, let's use the homebrew `utils::views::unique()` before unique is accepted by the C++ standard. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-01-21 16:24:45 +08:00
Kefu Chai	d5d251da9a	utils: implement drop-in replacement for replacing boost::adaptors::uniqued Add a custom implementation of boost::adaptors::uniqued that is compatible with the C++20 ranges library. This bridges the gap between Boost.Range and the C++ standard library ranges until std::views::unique becomes available in C++26. Currently, the unique view is included in [P2760](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2760r0.html) "A Plan for C++26 Ranges", which targets C++26. The implementation provides: - A lazy view adaptor that presents unique consecutive elements - No modification of source range - Compatibility with C++20 range views and concepts - Lighter header dependencies compared to Boost This resolves compilation errors when piping C++20 range views to boost::adaptors::uniqued, which fails due to concept requirements mismatch. For example: ```c++ auto range = std::views::take(n) | boost::adaptors::uniqued; // fails ``` This change also offers us a lightweight solution in terms of smaller header dependency. While std::ranges::unique has been available since C++20, it's an eager algorithm that modifies the source range in-place, unlike boost::adaptors::uniqued which is a lazy view. The proposed std::views::unique (P2760) targeting C++26 would provide this functionality, but is not yet available. This implementation serves as an interim solution for filtering consecutive duplicate elements using range views until std::views::unique is standardized. For more details on the differences between `std::ranges::unique` and `boost::adaptors::uniqued`: - boost::adaptors::uniqued is a view adaptor that creates a lazy view over the original range. 
It: * Doesn't modify the source range * Returns a view that presents unique consecutive elements * Is non-destructive and lazy-evaluated * Can be composed with other views - std::ranges::unique is an algorithm that: * Modifies the source range in-place * Removes consecutive duplicates by shifting elements * Returns an iterator to the new logical end * Cannot be used as a view or composed with other range adaptors Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-01-21 16:24:45 +08:00
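The difference between in-place and non-destructive deduplication listed above can be seen with standard algorithms alone. The sketch below deliberately uses the classic `std::unique` and `std::unique_copy`, which have the same eager semantics as their `std::ranges` counterparts; it does not reproduce `utils::views::unique` or a lazy view, and `dedup_in_place` and `dedup_copy` are hypothetical helper names.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Eager, in place: std::unique shifts the kept elements to the front and
// returns the new logical end; the source container itself is modified.
inline void dedup_in_place(std::vector<int>& v) {
    v.erase(std::unique(v.begin(), v.end()), v.end());
}

// Non-destructive: the source stays untouched and the consecutive-unique
// elements are copied out. This is still eager (it allocates a new vector),
// unlike a lazy view adaptor, which would produce elements on demand.
inline std::vector<int> dedup_copy(const std::vector<int>& v) {
    std::vector<int> out;
    std::unique_copy(v.begin(), v.end(), std::back_inserter(out));
    return out;
}
```

Both helpers remove only consecutive duplicates, so `{1, 1, 2, 2, 1}` becomes `{1, 2, 1}`, which matches the "unique consecutive elements" behavior described for the view.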
Tomasz Grabiec	8059090a29	Merge 'Cache base info for view schemas in the schema registry' from Wojciech Mitros Currently, when we load a frozen schema into the registry, we lose the base info if the schema was of a view. Because of that, in various places we need to set the base info again, and in some codepaths we may miss it completely, which may make us unable to process some requests (for example, when executing reverse queries on views). Even after setting the base info, we may still lose it if the schema entry gets deactivated due to all `schema_ptr`s temporarily dying. To fix this, this patch adds the base schema to the registry, alongside the view schema. We store just the frozen base schema, so that we can transfer it across shards. With the base schema, we can now set the base info when returning the schema from the registry. As a result, we can now assume that all view schemas returned by the registry have base_info set. In this series we also make sure that the view schemas in the registry are kept up-to-date in regards to base schema changes. Fixes https://github.com/scylladb/scylladb/issues/21354 This issue is a bug, so adding backport labels 6.1 and 6.2 Closes scylladb/scylladb#21862 * github.com:scylladb/scylladb: test: add test for schema registry maintaining base info for views schema_registry: avoid setting base info when getting the schema from registry schema_registry: update cached base schemas when updating a view schema_registry: cache base schemas for views db: set base info before adding schema to registry	2025-01-21 00:17:54 +01:00
Nadav Har'El	3e16b80014	Merge 'Reject create table with compact storage' from Benny Halevy As discussed in https://github.com/scylladb/scylladb/issues/12263#issuecomment-1853576813, compact storage tables are deprecated. Yet, there is nothing in the code that prevents users from creating such tables. This patch adds a live-updateable config option: `enable_create_table_with_compact_storage`, set to `false` by default, that requires users to opt in in order to create new tables WITH COMPACT STORAGE. Refs scylladb/scylladb#12263, scylladb/scylladb#16375 * Since this guardrail is an enhancement, no backport is needed Closes scylladb/scylladb#16403 * github.com:scylladb/scylladb: docs: ddl: document the deprecation of compact tables test: enable_create_table_with_compact_storage for tests that need it config: add enable_create_table_with_compact_storage	2025-01-20 22:02:02 +02:00
Piotr Dulikowski	780ff17ff5	test/auth_cluster: make test_service_level_metric_name_change useful The test test_service_level_metric_name_change was originally introduced to serve as a regression test for scylladb/scylla-enterprise#4912. Before the fix, some per-scheduling-group metrics would not get adjusted when the scheduling group gets renamed (which does happen for SL-managed scheduling groups) and it would be possible to attempt to register metrics with the same set of labels, resulting in an error. However, in scylladb/scylla-enterprise#4764, another bug was fixed which affected the test. Before a service level is created, a "test" scheduling group can be created by service level controller if it is unsure whether it is allowed to create more scheduling groups or not. If creation of the scheduling group succeeds, it is put into the pool of scheduling groups to be reused when a new service level is created. Therefore, the node handling CREATE SERVICE LEVEL would always use the scheduling group that was originally created for the sake of the test as a SG for the new service level. All of the above is intentional and was actually fixed by the aforementioned issue. However, the test scheduling groups would always get unique names and, therefore, the error would no longer reproduce. However, the faulty logic that ran previously and caused the bug still runs - when a node updates its service levels cache on group0 reload. The test previously used only one node. Fix it by starting two nodes instead of one at the beginning of the test and by serving all service level commands to the first node - were the issue not fixed, the error would get triggered on the second node.	2025-01-20 18:17:15 +01:00
Tomasz Grabiec	c7f78edc78	Merge 'repair: Wire repair_time in system.tablets for tombstone gc' from Asias He The repair_time in system.tablets will be updated when repair runs successfully. We can now use it to update the repair time for tombstone gc, i.e, when the system.tablets.repair_time is propagated, call gc_state.update_repair_time() on the node that is the owner of the tablet. Since `b3b3e880d3` ("repair: Reduce hints and batchlog flush"), the repair time that could be used for tombstone gc might be smaller than when the repair is started, so the actual repair time for tombstone gc is returned by the repair rpc call from the repair master node. Fixes #17507 New feature. No backport is needed. Closes scylladb/scylladb#21896 * github.com:scylladb/scylladb: repair: Stop using rpc to update repair time for repairs scheduled by scheduler repair: Wire repair_time in system.tablets for tombstone gc test: Disable flush_cache_time for two tablet repair tests test: Introduce guarantee_repair_time_next_second helper repair: Return repair time for repair_service::repair_tablet service: Add tablet_operation.hh	2025-01-20 18:08:49 +01:00
Benny Halevy	88ae067ddb	everywhere: add skeletal support for the in_memory_tables feature Forward-ported from scylla-enterprise. Note that the feature has been deprecated and the implementation is provided only for backward compatibility with pre-existing features and schema. Tested manually after adding the following to feature_service: ``` gms::feature workload_prioritization { *this, "WORKLOAD_PRIORITIZATION"sv }; ``` Launched a single-node cluster running 2023.1.10 ``` cqlsh> create KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh> create TABLE ks.test ( pk int PRIMARY KEY, val int ) WITH compaction = {'class': 'InMemoryCompactionStrategy'}; ``` log: ``` Scylla version 2023.1.10-0.20241227.21cffccc1ccd with build-id bd65b8399cb13b713a87e57fe333cfcabfd50be7 starting ... ... INFO 2024-12-27 19:45:16,563 [shard 0] migration_manager - Create new ColumnFamily: org.apache.cassandra.config.CFMetaData@0x600000f1b400[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName=ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,readRepairChance=0,dcLocalReadRepairChance=0,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,keyValidator=org.apache.cassandra.db.marshal.Int32Type,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class 
org.apache.cassandra.db.compaction.InMemoryCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,in_memory=false,version=5529c631-c47a-11ef-bd1d-4295734ce5a8,droppedColumns={},collections={},indices={}] INFO 2024-12-27 19:45:16,564 [shard 0] schema_tables - Creating ks.test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ec88d510-6aff-344a-914d-541d37081440 ``` Upgraded to this branch and started scylla. Verified that ks.test was successfuly loaded: log: ``` INFO 2024-12-27 19:48:58,115 [shard 0:main] init - Scylla version 6.3.0~dev-0.20241227.a64c6dfc153e with build-id f9496134a09cf2e55d3865b9e9ff499f672aa7da starting ... ... WARN 2024-12-27 19:53:02,948 [shard 1:main] CompactionStrategy - InMemoryCompactionStrategy is no longer supported. Defaulting to NullCompactionStrategy. ... 
INFO 2024-12-27 19:53:02,948 [shard 0:main] database - Keyspace ks: Reading CF test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ec88d510-6aff-344a-914d-541d37081440 storage=/home/bhalevy/scylladb/data/ks/test-5529c630c47a11efbd1d4295734ce5a8 ``` Then, tested: ``` cqlsh> describe KEYSPACE ks; CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true AND tablets = {'enabled': false}; CREATE TABLE ks.test ( pk int, val int, PRIMARY KEY (pk) ) WITH bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} AND comment = '' AND compaction = {'class': 'InMemoryCompactionStrategy'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND speculative_retry = '99.0PERCENTILE'; cqlsh> alter TABLE ks.test with compaction = {'class': 'SizeTieredCompactionStrategy'}; cqlsh> describe KEYSPACE ks; CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true AND tablets = {'enabled': false}; CREATE TABLE ks.test ( pk int, val int, PRIMARY KEY (pk) ) WITH bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} AND comment = '' AND compaction = {'class': 'SizeTieredCompactionStrategy'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND speculative_retry = '99.0PERCENTILE' AND tombstone_gc = {'mode': 'timeout', 'propagation_delay_in_seconds': '3600'}; ``` log: ``` INFO 2024-12-27 19:56:40,465 [shard 0:stmt] 
migration_manager - Update table 'ks.test' From org.apache.cassandra.config.CFMetaData@0x60000362d800[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName==ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.InMemoryCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,version=ec88d510-6aff-344a-914d-541d37081440,droppedColumns={},collections={},indices={}] To org.apache.cassandra.config.CFMetaData@0x60000336e000[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName==ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, 
droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,version=ecccf010-c47b-11ef-b52c-622f2f0e87c4,droppedColumns={},collections={},indices={}] INFO 2024-12-27 19:56:40,466 [shard 0: gms] schema_tables - Altering ks.test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ecccf010-c47b-11ef-b52c-622f2f0e87c4 ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#22068	2025-01-20 16:55:17 +02:00
Asias He	8208688178	Introduce file stream for tablet File based stream is a new feature that optimizes tablet movement significantly. It streams the entire SSTable files without deserializing SSTable files into mutation fragments and re-serializing them back into SSTables on receiving nodes. As a result, less data is streamed over the network, and less CPU is consumed, especially for data models that contain small cells. The following patches are imported from the scylla enterprise: ) Merge 'Introduce file stream for tablet' from Asias He This patch uses Seastar RPC stream interface to stream sstable files on network for tablet migration. It streams sstables instead of mutation fragments. The file based stream has multiple advantages over the mutation streaming. - No serialization or deserialization for mutation fragments - No need to read and process each mutation fragment - On wire data is more compact and smaller In the test below, a significant speed up is observed. Two nodes, 1 shard per node, 1 initial_tablets: - Start node 1 - Insert 10M rows of data with c-s - Bootstrap node 2 Node 1 will migrate data to node 2 with the file stream. 
Test results: 1) File stream: bytes on wire = 1132006250 bytes, bw = 836MB/s [shard 0:stre] stream_blob - stream_sstables[eadaa8e0-a4f2-4cc6-bf10-39ad1ce106b0] Finished sending sstable_nr=2 files_nr=18 files={} range=(-1,9223372036854775807] bytes_sent=1132006250 stream_bw=836MB/s [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 1.08004s seconds 2) Mutation stream: bytes on wire = 3030004736 bytes, bw = 125410.87 KiB/s = 128MB/s [shard 0:stre] stream_session - [Stream #406dc8b0-56b5-11ee-bc2d-000bf4871058] Streaming plan for Tablet migration-ks1-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=2958989 KiB, 125410.87 KiB/s [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 23.5992s seconds Test Summary: File stream v.s. Mutation stream improvements - Stream bandwidth = 836 / 128 (MB/s) = 6.53X - Stream time = 23.60 / 1.08 (Seconds) = 21.85X - Stream bytes on wire = 3030004736 / 1132006250 (Bytes)= 2.67X Closes scylladb/scylla-enterprise#3438 github.com:scylladb/scylla-enterprise: tests: Add file_stream_test streaming: Implement file stream for tablet ) streaming: Use new take_storage_snapshot interface The new take_storage_snapshot returns a file object instead of a file name. This allows the file stream sender to read from the file even if the file is deleted by compaction. Closes scylladb/scylla-enterprise#3728 ) streaming: Protect unsupported file types for file stream Currently, we assume the file streamed over the stream_blob rpc verb is a sstable file. This patch rejects the unsupported file types on the receiver side. This allows us to stream more file types later using the current file stream infrastructure without worrying about old nodes processing the new file types in the wrong way. 
- The file_ops::noop is renamed to file_ops::stream_sstables to be explicit about the file types - A missing test_file_stream_error_injection is added to the idl Fixes: #3846 Tests: test_unsupported_file_ops Closes scylladb/scylla-enterprise#3847 ) idl: Add service::session_id id to idl It will be used in the next patch. Refs #3907 ) streaming: Protect file stream with topology_guard Similar to "storage_service, tablets: Use session to guard tablet streaming", this patch protects file stream with topology_guard. Fixes #3907 ) streaming: Take service topology_guard under the try block Taking the service::topology_guard could throw. Currently, it throws outside the try block, so the rpc sink will not be closed, causing the following assertion: ``` scylla: seastar/include/seastar/rpc/rpc_impl.hh:815: virtual seastar::rpc::sink_impl<netw::serializer, streaming::stream_blob_cmd_data>::~sink_impl() [Serializer = netw::serializer, Out = <streaming::stream_blob_cmd_data>]: Assertion `this->_con->get()->sink_closed()' failed. ``` To fix, move more code including the topology_guard taking code to the try block. Fixes https://github.com/scylladb/scylla-enterprise/issues/4106 Closes scylladb/scylla-enterprise#4110 ) Merge 'Preserve original SSTable state with file based tablet migration' from Raphael "Raph" Carvalho We're not preserving the SSTable state across file based migration, so staging SSTables for example are being placed into main directory, and consequently, we're mixing staging and non-staging data, losing the ability to continue from where the old replica left off. It's expected that the view update backlog is transferred from old into new replica, as migration doesn't wait for leaving replica to complete view update work (which can take long). Elasticity is preferred. So this fix guarantees that the state of the SSTable will be preserved by propagating it in form of subdirectory (each subdirectory is statically mapped with a particular state). 
The staging sstables aren't being registered into view update generator yet, as that's supposed to be fixed in OSS (more details can be found at https://github.com/scylladb/scylladb/issues/19149). Fixes #4265. Closes scylladb/scylla-enterprise#4267 * github.com:scylladb/scylla-enterprise: tablet: Preserve original SSTable state with file based tablet migration sstables: Add get method for sstable state ) sstable: (Re-)add shareabled_components getter ) Merge 'File streaming sstables: Use sstable source/sink to transfer snapshots' from Calle Wilund Fixes #4246 Alternative approach/better separation of concern, transport vs. sstable layer. Builds on #4472, but fancier. Ensures we transfer and pre-process scylla metadata for streamed file blobs first, then properly apply receiving nodes local config by using a source and sink layer exported from sstables, which handles things like ordering, metadata filtering (on source) as well as handling metadata and proper IO paths when writing data on receiver node (sink). This implementation maintains the statelessness of the current design, and the delegated sink side will re-read and re-write the metadata for each component processed. This is a little wasteful, but the meta is small, and it is less error prone than trying to do caching cross-shards etc. The transport is isolated from the knowledge. This is an alternative/complement to #4436 and #4472, fixing the underlying issue. Note that while the layers/API:s here allows easy fixing of other fundamental problems in the feature (such as destination location etc), these are not included in the PR, to keep it as close to the current behaviour as possible. 
Closes scylladb/scylla-enterprise#4646 * github.com:scylladb/scylla-enterprise: raft_tests: Copy/add a topology test with encryption file streaming: Use sstable source/sink to transfer snapshots sstables: Add source and sink objects + producers for transfering a snapshot sstable::types: Add remove accessor for extension info in metadata ) The change for error injection in merge commit 966ea5955dd8760: File streaming now has "stream_mutation_fragments" error injection points so test_table_dropped_during_streaming works with file streaming. ) doc: document file-based streaming This commit adds a description of the file-based streaming feature to the documentation. It will be displayed in the docs using the scylladb_include_flag directive after https://github.com/scylladb/scylladb/pull/20182 is merged, backported to branch-6.0, and, in turn, branch-2024.2. Refs https://github.com/scylladb/scylla-enterprise/issues/4585 Refs https://github.com/scylladb/scylla-enterprise/issues/4254 Closes scylladb/scylla-enterprise#4587 ) doc: move File-based streaming to the Tablets source file-based-streaming This commit moves the description of file-based streaming from a common include file to the regular doc source file where tablets are described. Closes scylladb/scylla-enterprise#4652 ) streaming: sstable_stream_sink_impl: abort: prevent null pointer dereference Closes scylladb/scylladb#22034	2025-01-20 16:43:21 +02:00
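The improvement factors quoted in the test summary of the file stream commit above can be rechecked with a few lines of arithmetic. This is only a sketch: the numbers are copied from that commit message, and the variable names are illustrative, not taken from the Scylla tree.

```python
# Figures copied from the "Test results" part of the commit message above.
file_stream = {"bytes": 1132006250, "bw_mb_s": 836, "seconds": 1.08}
mutation_stream = {"bytes": 3030004736, "bw_mb_s": 128, "seconds": 23.60}

# Higher bandwidth is better, so file stream goes in the numerator.
bandwidth_x = file_stream["bw_mb_s"] / mutation_stream["bw_mb_s"]

# Lower time and fewer bytes are better, so mutation stream goes in the numerator.
time_x = mutation_stream["seconds"] / file_stream["seconds"]
bytes_x = mutation_stream["bytes"] / file_stream["bytes"]

print(f"bandwidth: {bandwidth_x:.2f}X")
print(f"time:      {time_x:.2f}X")
print(f"bytes:     {bytes_x:.2f}X")
```

The bandwidth and time ratios match the summary. The bytes-on-wire ratio is about 2.677, which the commit message reports as 2.67X.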
Nadav Har'El	8caea23d2a	test/cqlpy/run: fix regression in "--release" option The way that the "test/cqlpy/run --release" feature runs older Scylla releases is that it takes today's command line parameters and "fixes" it to conform to what old releases took. This approach was easy to implement (and the resulting "--release" feature is super useful), but the downside is that we need to update this fixup code whenever we add new options to the Scylla command line used by test/cqlpy/run.py. Commit `d04f376` made test/cqlpy/run.py use a new option "--experimental-features=views-with-tablets", so now we need to remove it when running older versions of Scylla. So this is what we do in this patch. Fixes #22349 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#22350	2025-01-20 16:43:21 +02:00
Nadav Har'El	12cbdfa095	test/cqlpy: add regression test for tombstone_gc in "desc table" The small cqlpy test in this patch is a regression test for issue #14390, which claimed that the Scylla-only "tombstone_gc" option is missing from the output of "describe table". This test shows that this report is not true, at least not when the "server-side describe" is used. "test/cqlpy/run --release ..." shows that this test passes on master and also for Scylla versions all the way back to Scylla 5.2 (Scylla 5.1 did not support server-side describe, so the test fails for that reason). This suggests that the report in issue #14390 was for old-style client-side (cqlsh) describe, which we no longer support, so this issue can be closed. Fixes #14390. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#22354	2025-01-20 16:43:21 +02:00
Benny Halevy	f3ab00e61c	test: enable_create_table_with_compact_storage for tests that need it Now enable_create_table_with_compact_storage can be set to `false` by default in db/config. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-20 08:14:37 +02:00
Benny Halevy	0110eb0506	config: add enable_create_table_with_compact_storage As discussed in https://github.com/scylladb/scylladb/issues/12263#issuecomment-1853576813, compact storage tables are deprecated. Yet, there is nothing in the code that prevents users from creating such tables. This patch adds a live-updateable config option: `enable_create_table_with_compact_storage` that requires users to opt in in order to create new tables WITH COMPACT STORAGE. The option is currently set to `true` by default in db/config to reduce the churn to tests and to `false` in scylla.yaml, for new clusters. TODO: once regression tests that use compact storage are converted to enable the option, change the default in db/config to false. A unit test was added to test/cql-pytest that checks that the respective cql query fails as expected with the default option or when it is explicitly set to `false`, and that the query succeeds when the option is set to `true`. Note that `check_restricted_table_properties` already returns an optional warning, but it is only logged and not returned in the `prepared_statement`. Fixing that is out of the scope of this patch. See https://github.com/scylladb/scylladb/issues/20945 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-20 08:03:25 +02:00
Kefu Chai	1ef2d9d076	tree: migrate from boost::adaptors::transformed to std::views::transform Replace remaining uses of boost::adaptors::transformed with std::views::transform to reduce Boost dependencies, following the migration pattern established in `bab12e3a`. This change addresses recently merged code that reintroduced Boost header dependencies through boost::adaptors::transformed usage. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22365	2025-01-17 16:56:40 +02:00
Botond Dénes	47989b1503	Merge 'tasks: add tablet resize virtual task' from Aleksandra Martyniuk In this change, tablet_virtual_task starts supporting tablet resize (i.e. split and merge). Users can see running resize tasks - finished tasks are not presented with the task manager API. A new task state "suspended" is added. If a resize was revoked, it will appear to users as suspended. We assume that the resize was revoked when the tablet number didn't change. Fixes: #21366. Fixes: #21367. No backport, new feature Closes scylladb/scylladb#21891 * github.com:scylladb/scylladb: test: boost: check resize_task_info in tablet_test.cc test: add tests to check revoked resize virtual tasks test: add tests to check the list of resize virtual tasks test: add tests to check spilt and merge virtual tasks status test: test_tablet_tasks: generalize functions replica: service: add split virtual task's children replica: service: pass parent info down to storage_group::split tasks: children of virtual tasks aren't internal by default tasks: initialize shard in task_info ctor service: extend tablet_virtual_task::abort service: retrun status_helper struct from tablet_virtual_task::get_status_helper service: extend tablet_virtual_task::wait tasks: add suspended task state service: extend tablet_virtual_task::get_status service: extend tablet_virtual_task::contains service: extend tablet_virtual_task::get_stats service: add service::task_manager_module::get_nodes tasks: add task_manager::get_nodes tasks: drop noexcept from module::get_nodes replica: service: add resize_task_info static column to system.tablets locator: extend tablet_task_info to cover resize tasks	2025-01-17 14:24:07 +02:00
Piotr Dulikowski	6aa962f5f4	Merge 'Add audit subsystem for database operations' from Paweł Zakrzewski Introduces a comprehensive audit system to track database operations for security and compliance purposes. This change includes: Core Components: - New audit subsystem for logging database operations - Service level integration for proper resource management - CQL statement tracking with operation categories - Login process integration for tenant management Key Features: - Configurable audit logging (syslog/table) - Operation categorization (QUERY/DML/DDL/DCL/AUTH/ADMIN) - Selective auditing by keyspace/table - Password sanitization in audit logs - Service level shares support (1-1000) for workload prioritization - Proper lifecycle management and cleanup I ran the dtests for audit (manually enabled) and they pass. The in-repo tests pass. Notably, there should be no non-whitespace changes between this and scylla-enterprise Fixes scylladb/scylla-enterprise#4999 Closes scylladb/scylladb#22147 * github.com:scylladb/scylladb: audit: Add shares support to service level management audit: Add service level support to CQL login process audit: Add support to CQL statements audit: Integrate audit subsystem into Scylla main process audit: Add documentation for the audit subsystem audit: Add the audit subsystem	2025-01-17 13:14:55 +01:00
Kamil Braun	89ee2a6834	Merge 'drop ip addresses from token metadata' from Gleb Now that all topology related code uses host ids there is no point in maintaining ip to id (and back) mappings in the token metadata. After the patch the mapping will be maintained in the gossiper only. The rest of the system will use host ids and in rare cases where translation is needed (mostly for UX compatibility reasons) the translation will be done using gossiper. Fixes: scylladb/scylla#21777 * 'gleb/drop-ip-from-tm-v3' of github.com:scylladb/scylla-dev: (57 commits) hint manager: do not translate ip to id in case hint manager is stopped already locator: token_metadata: drop update_host_id() function that does nothing now locator: topology: drop indexing by ips repair: drop unneeded code storage_service: use host_id to look for a node in on_alive handler storage_proxy: translate ips to ids in forward array using gossiper locator: topology: remove unused functions storage_service: check for outdated ip in on_change notification in the peers table storage_proxy: translate id to ip using address map in tablets's describe_ring code instead of taking one from the topology topology coordinator: change connection dropping code to work on host ids cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query locator: drop unused function from tablet_effective_replication_map api: view_build_statuses: do not use IP from the topology, but translate id to ip using address map instead locator: token_metadata: remove unused ip based functions locator: network_topology_strategy: use host_id based function to check number of endpoints in dcs gossiper: drop get_unreachable_token_owners functions storage_service: use gossiper to map ip to id in node_ops operations storage_service: fix indentation after the last patch storage_service: drop loops from node ops replace_prepare handling since there can be only one replacing node token_metadata: drop no longer used functions ...	2025-01-17 11:00:52 +01:00
Pavel Emelyanov	14c3fbbf8c	Merge 'sstable_directory: do not load remote unshared sstables in process_descriptor()' from Lakshmi Narayanan Sreethar The sstable loader relied on the generation id to provide an efficient hint about the shard that owns an sstable. But, this hint was rendered ineffective with the introduction of UUID generation, as the shard id was no longer embedded in the generation id. This also became suboptimal with the introduction of tablets. Commit `0c77f77` addressed this issue by reading the minimum from disk to determine sstable ownership but this improvement was lost with commit `63f1969`, which optimistically assumed that hints would work most of the time, which isn't true. This commit restores that change - shard id of a table is deduced by reading minimally from disk and then the sstable is fully loaded only if it belongs to the local shard. This patch also adds a testcase to verify that sstables are loaded only in their respective shards. Fixes #21015 This fixes a regression and should be backported. Closes scylladb/scylladb#22263 * github.com:scylladb/scylladb: sstable_directory: do not load remote sstables in process_descriptor sstable_directory: update `load_sstable()` definition sstable_directory: reintroduce `get_shards_for_this_sstable()`	2025-01-17 11:17:54 +03:00
Asias He	53e6025aa6	repair: Wire repair_time in system.tablets for tombstone gc The repair_time in system.tablets will be updated when repair runs successfully. We can now use it to update the repair time for tombstone gc, i.e, when the system.tablets.repair_time is propagated, call gc_state.update_repair_time() on the node that is the owner of the tablet. Since `b3b3e880d3` ("repair: Reduce hints and batchlog flush"), the repair time that could be used for tombstone gc might be smaller than when the repair is started, so the actual repair time for tombstone gc is returned by the repair rpc call from the repair master node. Fixes #17507	2025-01-17 16:12:05 +08:00
Asias He	0b2fef74bc	test: Disable flush_cache_time for two tablet repair tests The cache of the hints and batchlog flush makes the exact repair time check difficult in the test. Disabling it for two repair tests that check the exact repair time.	2025-01-17 16:12:05 +08:00
Asias He	23afbd938c	test: Introduce guarantee_repair_time_next_second helper The repair time granularity is seconds. This helper makes sure the repair time is different than the previous one.	2025-01-17 16:12:05 +08:00
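Because the repair time has one-second granularity, a test that compares two repair times has to make sure the wall clock has moved into a later second between them. A minimal sketch of that idea follows. The function names and the injectable `clock`/`sleep` parameters are hypothetical and are not the helper added by the commit above.

```python
import math
import time


def seconds_until_next_second(now: float) -> float:
    """Return how long to wait until the next whole-second boundary after `now`."""
    return math.floor(now) + 1 - now


def wait_for_next_second(clock=time.time, sleep=time.sleep) -> None:
    """Block until the clock reports a later whole second than at call time."""
    start = clock()
    sleep(seconds_until_next_second(start))
    # sleep() may return slightly early on some platforms, so re-check the clock.
    while math.floor(clock()) <= math.floor(start):
        sleep(0.001)
```

Injecting `clock` and `sleep` keeps the sketch testable without real waiting. A real test helper in an async test suite would more likely use `asyncio.sleep`.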
Nadav Har'El	955ac1b7b7	test/alternator: close boto3 client before shutting down For several years now, we have seen a strange, and very rare, flakiness in Alternator tests described in issue #17564: We see all the tests pass, pytest declares them to have passed, and while Python is exiting, it crashes with a signal 11 (SIGSEGV). Because this happens exclusively in test/alternator and never in the test/cqlpy, we suspect that something that the test/alternator leaves behind but test/cqlpy does not, causes some race and crashes during shutdown. The immediate suspect is the boto3 library, or rather, the urllib3 library which it uses. This is more-or-less the only thing that test/alternator does which test/cqlpy doesn't. The urllib3 library keeps around pools of reusable connections, and it's possible (although I don't actually have any proof for it) that these open connections may cause a crash during shutdown. So in this patch I add to the "dynamodb" and "dynamodbstreams" fixtures (which all Alternator tests use to connect to the server), a teardown which calls close() for the boto3 client object. This close() call percolates down to calling clear() on urllib3's PoolManager. Hopefully, this will make some difference in the chance to crash during shutdown - and if it doesn't, it won't hurt. Refs #17564 Closes scylladb/scylladb#22341	2025-01-16 19:21:00 -05:00
Gleb Natapov	1e4b2f25dc	locator: token_metadata: drop update_host_id() function that does nothing now	2025-01-16 16:37:08 +02:00
Gleb Natapov	50fb22c8f9	locator: topology: drop indexing by ips Do not track id to ip mapping in the topology class any longer. There are no remaining users.	2025-01-16 16:37:08 +02:00
Gleb Natapov	97f95f1dbd	locator: token_metadata: remove unused ip based functions	2025-01-16 16:37:07 +02:00
Gleb Natapov	415e8de36e	locator: topology: change get_datacenter_endpoints and get_datacenter_racks to return host ids and amend users	2025-01-16 16:37:06 +02:00
Gleb Natapov	8433947932	locator: topology: remove get_location overload that works on ip and its last users	2025-01-16 16:37:06 +02:00
Gleb Natapov	1b6e1456e5	messaging_service: drop the usage of ip based token_metadata APIs We want to drop ips from token_metadata so move to use host id based counterparts. Messaging service gets a function that maps from ips to id when it starts listening.	2025-01-16 16:37:06 +02:00
Gleb Natapov	542360e825	test: drop inet_address usage from network_topology_strategy_test Move the test to work on host ids. IPs will be dropped eventually.	2025-01-16 16:37:06 +02:00
Kefu Chai	8d7786cb0e	build: cmake: use wasm32-wasip1 as an alternative of wasm32-wasi wasm32-wasi has been removed in Rust 1.84 (Jan 5th, 2025). if one compiles the tree with Rust 1.84 or up, following build failure is expected: ``` [2/305] Building WASM /home/kefu/dev/scylladb/build/wasm/return_input.wasm FAILED: wasm/return_input.wasm /home/kefu/dev/scylladb/build/wasm/return_input.wasm cd /home/kefu/dev/scylladb/test/resource/wasm/rust && /usr/bin/cargo build --target=wasm32-wasi --example=return_input --locked --manifest-path=Cargo.toml --target-dir=/home/kefu/dev/scylladb/build/test/resource/wasm/rust && wasm-opt /home/kefu/dev/scylladb/build/test/resource/wasm/rust/wasm32-wasi//debug/examples/return_input.wasm -Oz -o /home/kefu/dev/scylladb/build/wasm/return_input.wasm && wasm-strip /home/kefu/dev/scylladb/build/wasm/return_input.wasm error: failed to run `rustc` to learn about target-specific information Caused by: process didn't exit successfully: `rustc - --crate-name ___ --print=file-names --target wasm32-wasi --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=split-debuginfo --print=crate-name --print=cfg` (exit status: 1) --- stderr error: Error loading target specification: Could not find specification for target "wasm32-wasi". Run `rustc --print target-list` for a list of built-in targets ``` in order to workaround this issue, let's check for supported target, and use wasm32-wasip1 if wasm32-wasi is not listed as the supported target. Refs #20878 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22320	2025-01-16 16:28:29 +03:00
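The target selection described in the wasm32-wasip1 commit above comes down to one check against the output of `rustc --print target-list`. The actual change lives in the CMake build files. The Python below is only a sketch of the same decision, and both function names are hypothetical.

```python
import subprocess


def pick_wasm_target(target_list: str) -> str:
    """Prefer wasm32-wasi when rustc still lists it, else fall back to wasm32-wasip1."""
    targets = set(target_list.split())
    if "wasm32-wasi" in targets:
        return "wasm32-wasi"
    if "wasm32-wasip1" in targets:
        return "wasm32-wasip1"
    raise RuntimeError("rustc lists neither wasm32-wasi nor wasm32-wasip1")


def rustc_target_list() -> str:
    """Ask the installed rustc for its built-in targets (requires rustc on PATH)."""
    result = subprocess.run(
        ["rustc", "--print", "target-list"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `pick_wasm_target(rustc_target_list())` would then give the value to pass to `cargo build --target=...`.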
Botond Dénes	b2a03e03f7	Merge 'raft: Handle non-critical config update errors in when changing voter status.' from Sergey Zolotukhin When a node is bootstrapped, joins a cluster as a non-voter, and changes its role to a voter, errors can occur while committing a new Raft record, for instance, if the Raft leader changes during this time. These errors are not critical and should not cause a node crash, as the action can be retried. Fixes scylladb/scylladb#20814 Backport: This issue occurs frequently and disrupts the CI workflow to some extent. Backports are needed for versions 6.1 and 6.2. Closes scylladb/scylladb#22253 * github.com:scylladb/scylladb: raft: refactor `remove_from_raft_config` to use a timed `modify_config` call. raft: Refactor functions using `modify_config` to use a common wrapper for retrying. raft: Handle non-critical config update errors in when changing status to voter. test: Add test to check that a node does not fail on unknown commit status error when starting up. raft: Add run_op_with_retry in raft_group0.	2025-01-16 11:00:47 +02:00
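The retry approach described in the raft merge above can be illustrated with a small generic wrapper. This is a sketch under stated assumptions: `TransientCommitError`, `run_with_retry`, and the attempt/delay parameters are hypothetical, and it does not reproduce the signature or policy of `run_op_with_retry` in raft_group0.

```python
import time


class TransientCommitError(Exception):
    """Stand-in for a non-critical error, e.g. an unknown commit status."""


def run_with_retry(op, max_attempts=5, delay_s=0.1, sleep=time.sleep):
    """Call op() until it succeeds, retrying only on TransientCommitError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except TransientCommitError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            sleep(delay_s)
```

Any other exception type propagates immediately, so only errors known to be retryable are swallowed.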
Paweł Zakrzewski	5b1da31595	audit: Add shares support to service level management Introduces shares-based workload prioritization for service levels, allowing fine-grained control over resource allocation between tenants. Key changes: - Add shares option to service level configuration: - Valid range: 1-1000 shares - Default value: 1000 shares - Enterprise-only feature gated by WORKLOAD_PRIORITIZATION feature flag - Extend CQL interface: - Add shares parameter to CREATE/ALTER SERVICE_LEVEL - Add shares column to system_distributed.service_levels - Add percentage calculation to LIST SERVICE_LEVELS - Add shares to DESCRIBE EFFECTIVE SERVICE_LEVEL output - Add validation: - Enforce shares range (1-1000) - Validate enterprise feature flag - Handle unset/delete markers properly - Update service level statements: - Add shares validation to CREATE/ALTER operations - Preserve shares through default value replacement - Add proper decomposition for shares values in result sets This change enables operators to control relative resource allocation between tenants using proportional share scheduling, while maintaining backward compatibility with existing service level configurations.	2025-01-15 15:01:05 +01:00
Paweł Zakrzewski	28bd699c51	audit: Add service level support to CQL login process This change integrates service level functionality into the CQL authentication and connection handling: - Add scheduling_group_name to client_data to track service level assignments - Extend SASL challenge interface to expose authenticated username - Modify connection processing to support tenant switching: - Add switch_tenant() method to handle scheduling group changes - Add process_until_tenant_switch() to handle request processing boundaries - Implement no_tenant() default executor - Add execute_under_tenant_type for scheduling group management - Update connection lifecycle to properly handle service level changes: - Initialize connections with default scheduling group - Support dynamic scheduling group updates when service levels change - Ensure proper cleanup of scheduling group assignments The changes enable proper scheduling group assignment and management based on authenticated users' service levels, while maintaining backward compatibility for connections without service level assignments.	2025-01-15 11:10:36 +01:00
Paweł Zakrzewski	1810e2e424	audit: Integrate audit subsystem into Scylla main process Adds core integration of the audit subsystem into Scylla's main process flow. Changes include: - Import audit subsystem header - Initialize audit system during server startup using configuration and token metadata - Start audit system after API server initialization with query processor and memory manager - Add proper shutdown sequence for audit system using RAII pattern - Add error handling for audit system initialization failures The audit system is now properly integrated into Scylla's lifecycle, ensuring: - Correct initialization order relative to other subsystems - Proper resource cleanup during shutdown - Graceful error handling for initialization failures	2025-01-15 11:10:36 +01:00
Paweł Zakrzewski	384641194a	audit: Add the audit subsystem This change introduces a new audit subsystem that allows tracking and logging of database operations for security and compliance purposes. Key features include: - Configurable audit logging to either syslog or a dedicated system table (audit.audit_log) - Selective auditing based on: - Operation categories (QUERY, DML, DDL, DCL, AUTH, ADMIN) - Specific keyspaces - Specific tables - New configuration options: - audit: Controls audit destination (none/syslog/table) - audit_categories: Comma-separated list of operation categories to audit - audit_tables: Specific tables to audit - audit_keyspaces: Specific keyspaces to audit - audit_unix_socket_path: Path for syslog socket - audit_syslog_write_buffer_size: Buffer size for syslog writes The audit logs capture details including: - Operation timestamp - Node and client IP addresses - Operation category and query - Username - Success/failure status - Affected keyspace and table names	2025-01-15 11:10:35 +01:00
Piotr Dulikowski	72f28ce81e	Merge 'main, view: Pair view builder drain with its start' from Dawid Mędrek In this PR, we pair draining the view builder with its start. To better understand what was done and why, let's first look at the situation before this commit and the context of it: (a) The following things happened in order: 1. The view builder would be constructed. 2. Right after that, a deferred lambda would be created to stop the view builder during shutdown. 3. group0_service would be started. 4. A deferred lambda stopping group0_service would be created right after that. 5. The view builder would be started. (b) Because the view builder depends on group0_client, it couldn't be started before starting group0_service. On the other hand, other services depend on the view builder, e.g. the stream manager. That makes changing the order of initialization a difficult problem, so we want to avoid doing that unless we're sure it's the right choice. (c) Since the view builder uses group0_client, there was a possibility of running into a segmentation fault issue in the following scenario: 1. A call to `view_builder::mark_view_build_success()` is issued. 2. We stop group0_service. 3. `view_builder::mark_view_build_success()` calls `announce_with_raft()`, which leads to a use-after-free because group0_service has already been destroyed. This very scenario took place in scylladb/scylladb#20772. Initially, we decided to solve the issue by initializing group0_service a bit earlier (scylladb/scylladb@7bad8378c7). Unfortunately, it led to other issues described in scylladb/scylladb#21534, so we revert that patch. These changes are the second attempt to the problem where we want to solve it in a safer manner. The solution we came up with is to pair the start of the view builder with a deferred lambda that deinitializes it by calling `view_builder::drain()`. No other component of the system should be able to use the view builder anymore, so it's safe to do that. 
Furthermore, that pairing makes the analysis of initialization/deinitialization order much easier. We also solve the aforementioned use-after-free issue because the view builder itself will no longer attempt to use group0_client. Note that we still pair a deferred lambda calling `view_builder::stop()` with the construction of the view builder; that function will also call `view_builder::drain()`. Another notable thing is that `view_builder::drain()` may be called earlier by `storage_service::do_drain()`. In other words, these changes cover the situation when Scylla runs into a problem when starting up. Backport: The patch I'm reverting made it to 6.2, so we want to backport this one there too. Fixes scylladb/scylladb#20772 Fixes scylladb/scylladb#21534 Closes scylladb/scylladb#21909 * github.com:scylladb/scylladb: test/topology_custom: Add test for Scylla with disabled view building main, view: Pair view builder drain with its start Revert "main,cql_test_env: start group0_service before view_builder"	2025-01-15 09:50:26 +01:00
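The ordering argument in the view builder merge above rests on deferred actions running in reverse order of registration. The sketch below models that with Python's `contextlib.ExitStack`, recording events as strings instead of starting real services. It is an analogy for the deferred lambdas in main, not Scylla code, and the function name is hypothetical.

```python
from contextlib import ExitStack


def simulate_startup_and_shutdown() -> list:
    """Record the order of start and teardown steps; callbacks run LIFO on exit."""
    events = []
    with ExitStack() as stack:
        events.append("construct view_builder")
        stack.callback(events.append, "view_builder.stop")

        events.append("start group0_service")
        stack.callback(events.append, "group0_service.stop")

        events.append("start view_builder")
        # Paired with the start: registered last, so it runs first on shutdown.
        stack.callback(events.append, "view_builder.drain")

        events.append("running")
    return events
```

Since the drain is registered after group0_service starts, it runs before group0_service stops, which is the property the merge relies on.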
Sergey Zolotukhin	8c48f7ad62	raft: Handle non-critical config update errors in when changing status to voter. When a node is bootstrapped and joins a cluster as a non-voter, errors can occur while committing a new Raft record, for instance, if the Raft leader changes during this time. These errors are not critical and should not cause a node crash, as the action can be retried. Fixes scylladb/scylladb#20814	2025-01-15 09:49:15 +01:00
Sergey Zolotukhin	16053a86f0	test: Add test to check that a node does not fail on unknown commit status error when starting up. Test that a node is starting successfully if while joining a cluster and becoming a voter, it receives an unknown commit status error. Test for scylladb/scylladb#20814	2025-01-14 17:12:06 +01:00
Kamil Braun	2eac7a2d61	Merge 'test/pylib: two trivial cleanups' from Kefu Chai - use "foo not in bar" instead of "not foo in bar" - test/pylib: use foo instead of `'{}'.format(foo)` --- it's a cleanup, hence no need to backport. Closes scylladb/scylladb#22066 * github.com:scylladb/scylladb: test/pylib: use `foo` instead of `'{}'.format(foo)` test/pylib: use "foo not in bar" instead of "not foo in bar"	2025-01-14 16:27:44 +01:00
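Both cleanups in the test/pylib merge above preserve behavior, which a short example can confirm. The variable names are illustrative only.

```python
bar = {"alpha": 1, "beta": 2}
foo = "gamma"

# `not foo in bar` parses as `not (foo in bar)`, so both spellings agree;
# `foo not in bar` is simply the idiomatic form.
old_style = not foo in bar
new_style = foo not in bar

# '{}'.format(x) returns str(x), so it is redundant when x is already a str.
name = "node1"
formatted = '{}'.format(name)
```

The second cleanup is an exact equivalence only when the value is already a string. For other types, `'{}'.format(x)` would still be needed, or `str(x)`.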

8158 Commits