Commit Graph

46305 Commits

Author SHA1 Message Date
Benny Halevy
d04cdce0fc view: mutate_MV: calculate keyspace-dependent flags once
All view live in the same keyspace as their base
table, so calculate the keyspace-dependent flags
once, outside the per-view update loop.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-01-22 09:04:24 +02:00
Nadav Har'El
a8805c4fc1 Merge 'cql3, test, utils: switch from boost::adaptors::uniqued to utils::views:unique ' from Kefu Chai
In order to reduce the dependency on external libraries, and for better integration with ranges in C++ standard library. let's use the homebrew `utils::views::unique()` before unique is accepted by the C++ standard.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#22393

* github.com:scylladb/scylladb:
  cql3, test: switch from boost::adaptors::uniqued to utils::views:unique
  utils: implement drop-in replacement for replacing boost::adaptors::uniqued
2025-01-21 19:06:21 +02:00
Sergey Zolotukhin
38caabe3ef test: Fix inconsistent naming of the log files.
The log file names created in `scylla_cluster.py` by
`ScyllaClusterManager`
and files to be collected in conftest.py by `manager` should be in
sync. This patch fixes the issue, originally introduced in
scylladb/scylladb#22192

Fixes scylladb/scylladb#22387

Backports: 6.1 and 6.2.

Closes scylladb/scylladb#22415
2025-01-21 10:45:17 +02:00
Kefu Chai
ccb7b4e606 cql3, test: switch from boost::adaptors::uniqued to utils::views:unique
In order to reduce the dependency on external libraries, and for better
integration with ranges in C++ standard library. let's use the homebrew
`utils::views::unique()` before unique is accepted by the C++ standard.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-21 16:24:45 +08:00
Kefu Chai
d5d251da9a utils: implement drop-in replacement for replacing boost::adaptors::uniqued
Add a custom implementation of boost::adaptors::uniqued that is compatible
with C++20 ranges library. This bridges the gap between Boost.Range and
the C++ standard library ranges until std::views::unique becomes available
in C++26. Currently, the unique view is included in
[P2214](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2760r0.html)
"A Plan for C++ Ranges Evolution", which targets C++26.

The implementation provides:
- A lazy view adaptor that presents unique consecutive elements
- No modification of source range
- Compatibility with C++20 range views and concepts
- Lighter header dependencies compared to Boost

This resolves compilation errors when piping C++20 range views to
boost::adaptors::uniqued, which fails due to concept requirements
mismatch. For example:
```c++
    auto range = std::views::take(n) | boost::adaptors::uniqued; // fails
```
This change also offers us a lightweight solution in terms of smaller
header dependency.

While std::ranges::unique exists in C++23, it's an eager algorithm that
modifies the source range in-place, unlike boost::adaptors::uniqued which
is a lazy view. The proposed std::views::unique (P2214) targeting C++26
would provide this functionality, but is not yet available.

This implementation serves as an interim solution for filtering consecutive
duplicate elements using range views until std::views::unique is
standardized.

For more details on the differences between `std::ranges::unique` and
`boost::adaptors::uniqued`:

- boost::adaptors::uniqued is a view adaptor that creates a lazy view over the original range. It:
  * Doesn't modify the source range
  * Returns a view that presents unique consecutive elements
  * Is non-destructive and lazy-evaluated
  * Can be composed with other views
- std::ranges::unique is an algorithm that:
  * Modifies the source range in-place
  * Removes consecutive duplicates by shifting elements
  * Returns an iterator to the new logical end
  * Cannot be used as a view or composed with other range adaptors

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-21 16:24:45 +08:00
Tomasz Grabiec
8059090a29 Merge 'Cache base info for view schemas in the schema registry' from Wojciech Mitros
Currently, when we load a frozen schema into the registry, we lose
the base info if the schema was of a view. Because of that, in various
places we need to set the base info again, and in some codepaths we
may miss it completely, which may make us unable to process some
requests (for example, when executing reverse queries on views).
Even after setting the base info, we may still lose it if the schema
entry gets deactivated due to all `schema_ptr`s temporarily dying.

To fix this, this patch adds the base schema to the registry, alongside
the view schema. We store just the frozen base schema, so that we can
transfer it across shards. With the base schema, we can now set the base
info when returning the schema from the registry. As a result, we can now
assume that all view schemas returned by the registry have base_info set.

In this series we also make sure that the view schemas in the registry are
kept up-to-date in regards to base schema changes.

Fixes https://github.com/scylladb/scylladb/issues/21354

This issue is a bug, so adding backport labels 6.1 and 6.2

Closes scylladb/scylladb#21862

* github.com:scylladb/scylladb:
  test: add test for schema registry maintaining base info for views
  schema_registry: avoid setting base info when getting the schema from registry
  schema_registry: update cached base schemas when updating a view
  schema_registry: cache base schemas for views
  db: set base info before adding schema to registry
2025-01-21 00:17:54 +01:00
Nadav Har'El
3e16b80014 Merge 'Reject create table with compact storage' from Benny Halevy
As discussed in
https://github.com/scylladb/scylladb/issues/12263#issuecomment-1853576813,
compact storage tables are deprecated.

Yet, there's is nothing in the code that prevents users
from creating such tables.

This patch adds a live-updateable config option:
`enable_create_table_with_compact_storage`, set to
`false` by default, that require users to opt-in
in order to create new tables WITH COMPACT STORAGE.

Refs scylladb/scylladb#12263, scylladb/scylladb#16375

* Since this guardrail is an enhancement, no backport is needed

Closes scylladb/scylladb#16403

* github.com:scylladb/scylladb:
  docs: ddl: document the deprecation of compact tables
  test: enable_create_table_with_compact_storage for tests that need it
  config: add enable_create_table_with_compact_storage
2025-01-20 22:02:02 +02:00
Tomasz Grabiec
c7f78edc78 Merge 'repair: Wire repair_time in system.tablets for tombstone gc' from Asias He
The repair_time in system.tablets will be updated when repair runs
successfully. We can now use it to update the repair time for tombstone
gc, i.e, when the system.tablets.repair_time is propagated, call
gc_state.update_repair_time() on the node that is the owner of the
tablet.

Since b3b3e880d3 ("repair: Reduce hints and batchlog flush"), the
repair time that could be used for tombstone gc might be smaller than
when the repair is started, so the actual repair time for tombstone gc
is returned by the repair rpc call from the repair master node.

Fixes #17507

New feature. No backport is needed.

Closes scylladb/scylladb#21896

* github.com:scylladb/scylladb:
  repair: Stop using rpc to update repair time for repairs scheduled by scheduler
  repair: Wire repair_time in system.tablets for tombstone gc
  test: Disable flush_cache_time for two tablet repair tests
  test: Introduce guarantee_repair_time_next_second helper
  repair: Return repair time for repair_service::repair_tablet
  service: Add tablet_operation.hh
2025-01-20 18:08:49 +01:00
Benny Halevy
88ae067ddb everywhere: add skeletal support for the in_memory_tables feature
Forward-ported from scylla-enterprise.
Note that the feature has been deprecated and the implementation
is provided only for backward compatibility with pre-existing
features and schema.

Tested manually after adding the following to feature_service:
```
    gms::feature workload_prioritization { *this, "WORKLOAD_PRIORITIZATION"sv };
```

Launched a single-node cluster running 2023.1.10
```
cqlsh> create KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> create TABLE ks.test ( pk int PRIMARY KEY, val int ) WITH compaction = {'class': 'InMemoryCompactionStrategy'};
```

log:
```
Scylla version 2023.1.10-0.20241227.21cffccc1ccd with build-id bd65b8399cb13b713a87e57fe333cfcabfd50be7 starting ...
...
INFO  2024-12-27 19:45:16,563 [shard 0] migration_manager - Create new ColumnFamily: org.apache.cassandra.config.CFMetaData@0x600000f1b400[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName=ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,readRepairChance=0,dcLocalReadRepairChance=0,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,keyValidator=org.apache.cassandra.db.marshal.Int32Type,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.InMemoryCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,in_memory=false,version=5529c631-c47a-11ef-bd1d-4295734ce5a8,droppedColumns={},collections={},indices={}]
INFO  2024-12-27 19:45:16,564 [shard 0] schema_tables - Creating ks.test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ec88d510-6aff-344a-914d-541d37081440
```

Upgraded to this branch and started scylla.
Verified that ks.test was successfuly loaded:

log:
```
INFO  2024-12-27 19:48:58,115 [shard 0:main] init - Scylla version 6.3.0~dev-0.20241227.a64c6dfc153e with build-id f9496134a09cf2e55d3865b9e9ff499f672aa7da starting ...
...
WARN  2024-12-27 19:53:02,948 [shard 1:main] CompactionStrategy - InMemoryCompactionStrategy is no longer supported. Defaulting to NullCompactionStrategy.
...
INFO  2024-12-27 19:53:02,948 [shard 0:main] database - Keyspace ks: Reading CF test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ec88d510-6aff-344a-914d-541d37081440 storage=/home/bhalevy/scylladb/data/ks/test-5529c630c47a11efbd1d4295734ce5a8
```

Then, tested:
```
cqlsh> describe KEYSPACE ks;

CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true AND tablets = {'enabled': false};

CREATE TABLE ks.test (
    pk int,
    val int,
    PRIMARY KEY (pk)
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = ''
    AND compaction = {'class': 'InMemoryCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99.0PERCENTILE';

cqlsh> alter TABLE ks.test with compaction = {'class': 'SizeTieredCompactionStrategy'};
cqlsh> describe KEYSPACE ks;

CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true AND tablets = {'enabled': false};

CREATE TABLE ks.test (
    pk int,
    val int,
    PRIMARY KEY (pk)
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = ''
    AND compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99.0PERCENTILE'
    AND tombstone_gc = {'mode': 'timeout', 'propagation_delay_in_seconds': '3600'};
```

log:
```
INFO  2024-12-27 19:56:40,465 [shard 0:stmt] migration_manager - Update table 'ks.test' From org.apache.cassandra.config.CFMetaData@0x60000362d800[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName==ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.InMemoryCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,version=ec88d510-6aff-344a-914d-541d37081440,droppedColumns={},collections={},indices={}] To org.apache.cassandra.config.CFMetaData@0x60000336e000[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName==ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,version=ecccf010-c47b-11ef-b52c-622f2f0e87c4,droppedColumns={},collections={},indices={}]
INFO  2024-12-27 19:56:40,466 [shard 0: gms] schema_tables - Altering ks.test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ecccf010-c47b-11ef-b52c-622f2f0e87c4
```

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#22068
2025-01-20 16:55:17 +02:00
Kefu Chai
9d6ec45730 build: support wasm32-wasip1 target in configure.py
Update configure.py to use wasm32-wasip1 as an alternative to wasm32-wasi,
matching the behavior previously implemented for CMake builds in 8d7786cb0e.
This ensures consistent WASI target handling across both build systems.

Refs #20878

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22386
2025-01-20 16:43:22 +02:00
Asias He
8208688178 Introduce file stream for tablet
File based stream is a new feature that optimizes tablet movement
significantly. It streams the entire SSTable files without deserializing
SSTable files into mutation fragments and re-serializing them back into
SSTables on receiving nodes. As a result, less data is streamed over the
network, and less CPU is consumed, especially for data models that
contain small cells.

The following patches are imported from the scylla enterprise:

*) Merge 'Introduce file stream for tablet' from Asias He

    This patch uses Seastar RPC stream interface to stream sstable files on
    network for tablet migration.

    It streams sstables instead of mutation fragments. The file based
    stream has multiple advantages over the mutation streaming.

    - No serialization or deserialization for mutation fragments
    - No need to read and process each mutation fragments
    - On wire data is more compact and smaller

    In the test below, a significant speed up is observed.

    Two nodes, 1 shard per node, 1 initial_tablets:

    - Start node 1
    - Insert 10M rows of data with c-s
    - Bootstrap node 2

    Node 1 will migration data to node2 with the file stream.

    Test results:

    1) File stream: bytes on wire = 1132006250 bytes, bw = 836MB/s

    [shard 0:stre] stream_blob - stream_sstables[eadaa8e0-a4f2-4cc6-bf10-39ad1ce106b0]
	Finished sending sstable_nr=2 files_nr=18 files={} range=(-1,9223372036854775807] bytes_sent=1132006250 stream_bw=836MB/s
    [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 1.08004s seconds

    2) Mutation stream: bytes on wire = 3030004736 bytes, bw = 125410.87 KiB/s = 128MB/s

    [shard 0:stre] stream_session - [Stream #406dc8b0-56b5-11ee-bc2d-000bf4871058]
	Streaming plan for Tablet migration-ks1-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=2958989 KiB, 125410.87 KiB/s
    [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 23.5992s seconds

    Test Summary:

    File stream v.s. Mutation stream improvements

    - Stream bandwidth = 836 / 128  (MB/s)  = 6.53X

    - Stream time = 23.60 / 1.08  (Seconds) = 21.85X

    - Stream bytes on wire = 3030004736 / 1132006250 (Bytes)= 2.67X

    Closes scylladb/scylla-enterprise#3438

    * github.com:scylladb/scylla-enterprise:
      tests: Add file_stream_test
      streaming: Implement file stream for tablet

*) streaming: Use new take_storage_snapshot interface

    The new take_storage_snapshot returns a file object instead of a file
    name. This allows the file stream sender to read from the file even if
    the file is deleted by compaction.

    Closes scylladb/scylla-enterprise#3728

*) streaming: Protect unsupported file types for file stream

    Currently, we assume the file streamed over the stream_blob rpc verb is
    a sstable file. This patch rejects the unsupported file types on the
    receiver side. This allows us to stream more file types later using the
    current file stream infrastructure without worrying about old nodes
    processing the new file types in the wrong way.

    - The file_ops::noop is renamed to file_ops::stream_sstables to be
      explicit about the file types

    - A missing test_file_stream_error_injection is added to the idl

    Fixes: #3846
    Tests: test_unsupported_file_ops

    Closes scylladb/scylla-enterprise#3847

*) idl: Add service::session_id id to idl

    It will be used in the next patch.

    Refs #3907

*) streaming: Protect file stream with topology_guard

    Similar to "storage_service, tablets: Use session to guard tablet
    streaming", this patch protects file stream with topology_guard.

    Fixes #3907

*) streaming: Take service topology_guard under the try block

    Taking the service::topology_guard could throw. Currently, it throws
    outside the try block, so the rpc sink will not be closed, causing the
    following assertion:

    ```
    scylla: seastar/include/seastar/rpc/rpc_impl.hh:815: virtual
    seastar::rpc::sink_impl<netw::serializer,
    streaming::stream_blob_cmd_data>::~sink_impl() [Serializer =
    netw::serializer, Out = <streaming::stream_blob_cmd_data>]: Assertion
    `this->_con->get()->sink_closed()' failed.
    ```

    To fix, move more code including the topology_guard taking code to the
    try block.

    Fixes https://github.com/scylladb/scylla-enterprise/issues/4106

    Closes scylladb/scylla-enterprise#4110

*) Merge 'Preserve original SSTable state with file based tablet migration' from Raphael "Raph" Carvalho

    We're not preserving the SSTable state across file based migration, so
    staging SSTables for example are being placed into main directory, and
    consequently, we're mixing staging and non-staging data, losing the
    ability to continue from where the old replica left off.
    It's expected that the view update backlog is transferred from old
    into new replica, as migration doesn't wait for leaving replica to
    complete view update work (which can take long). Elasticity is preferred.

    So this fix guarantees that the state of the SSTable will be preserved
    by propagating it in form of subdirectory (each subdirectory is
    statically mapped with a particular state).

    The staging sstables aren't being registered into view update generator
    yet, as that's supposed to be fixed in OSS (more details can be found
    at https://github.com/scylladb/scylladb/issues/19149).

    Fixes #4265.

    Closes scylladb/scylla-enterprise#4267

    * github.com:scylladb/scylla-enterprise:
      tablet: Preserve original SSTable state with file based tablet migration
      sstables: Add get method for sstable state

*) sstable: (Re-)add shareabled_components getter

*) Merge 'File streaming sstables: Use sstable source/sink to transfer snapshots' from Calle Wilund

    Fixes #4246

    Alternative approach/better separation of concern, transport vs. sstable layer. Builds on #4472, but fancier.

    Ensures we transfer and pre-process scylla metadata for streamed
    file blobs first, then properly apply receiving nodes local config
    by using a source and sink layer exported from sstables, which
    handles things like ordering, metadata filtering (on source) as well
    as handling metadata and proper IO paths when writing data on
    receiver node (sink).

    This implementation maintains the statelessness of the current
    design, and the delegated sink side will re-read and re-write the
    metadata for each component processed. This is a little wasteful,
    but the meta is small, and it is less error prone than trying to do
    caching cross-shards etc. The transport is isolated from the
    knowledge.

    This is an alternative/complement to #4436 and #4472, fixing the
    underlying issue. Note that while the layers/API:s here allows easy
    fixing of other fundamental problems in the feature (such as
    destination location etc), these are not included in the PR, to keep
    it as close to the current behaviour as possible.

    Closes scylladb/scylla-enterprise#4646

    * github.com:scylladb/scylla-enterprise:
      raft_tests: Copy/add a topology test with encryption
      file streaming: Use sstable source/sink to transfer snapshots
      sstables: Add source and sink objects + producers for transfering a snapshot
      sstable::types: Add remove accessor for extension info in metadata

*) The change for error injection in merge commit 966ea5955dd8760:

    File streaming now has "stream_mutation_fragments" error injection points
    so test_table_dropped_during_streaming works with file streaming.

*) doc: document file-based streaming

    This commit adds a description of the file-based streaming feature to the documentation.

    It will be displayed in the docs using the scylladb_include_flag directive after
    https://github.com/scylladb/scylladb/pull/20182 is merged, backported to branch-6.0,
    and, in turn, branch-2024.2.

    Refs https://github.com/scylladb/scylla-enterprise/issues/4585
    Refs https://github.com/scylladb/scylla-enterprise/issues/4254

    Closes scylladb/scylla-enterprise#4587

*) doc: move File-based streaming to the Tablets source file-based-streaming

    This commit moves the description of file-based streaming from a common include file
    to the regular doc source file where tablets are described.

    Closes scylladb/scylla-enterprise#4652

*) streaming: sstable_stream_sink_impl: abort: prevent null pointer dereference

Closes scylladb/scylladb#22034
2025-01-20 16:43:21 +02:00
Yaniv Michael Kaul
7495237a33 Remove noexcept_traits.hh header file
The content of the header file noexcept_traits.hh is unused throughout ScyllaDB's code base.
As part of a greater effort to cleanup Scylla's code and reduce content in the root directory, this header file is simply removed.

This is code cleanup - no need to backport.

Fixes: https://github.com/scylladb/scylladb/issues/22117
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#22139
2025-01-20 16:43:21 +02:00
Nadav Har'El
8caea23d2a test/cqlpy/run: fix regression in "--release" option
The way that the "test/cqlpy/run --release" feature runs older Scylla
releases is that it takes *today*'s command line parameters and "fixes"
it to conform to what old releases took. This approach was easy to
implement (and the resulting "--release" feature is super useful), but
the downside is that we need to update this fixup code whenever we add
new options to the Scylla command line used by test/cqlpy/run.py.

Commit d04f376 made test/cqlpy/run.py use a new option
"--experimental-features=views-with-tablets", so now we need to remove
it when running older versions of Scylla. So this is what we do in this
patch.

Fixes #22349

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#22350
2025-01-20 16:43:21 +02:00
Nadav Har'El
12cbdfa095 test/cqlpy: add regression test for tombstone_gc in "desc table"
The small cqlpy test in this patch is a regression test for issue #14390,
which claimed that the Scylla-only "tombstone_gc" option is missing from
the output of "describe table".

This test shows that this report is *not* true, at least not when the
"server-side describe" is used. "test/cqlpy/run --release ..." shows
that this test passes on master and also for Scylla versions all the
way back to Scylla 5.2 (Scylla 5.1 did not support server-side
describe, so the test fails for that reason).

This suggests that the report in issue #14390 was for old-style
client-side (cqlsh) describe, which we no longer support, so this
issue can be closed.

Fixes #14390.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#22354
2025-01-20 16:43:21 +02:00
Avi Kivity
d2869ecb2b partition_range_compat: drop dependency on boost ranges
Unused anyway.

Closes scylladb/scylladb#22359
2025-01-20 16:43:21 +02:00
Anna Stuchlik
e340d6a452 doc: remove Open Source references in the docs
Fixes https://github.com/scylladb/scylladb/issues/22325

Closes scylladb/scylladb#22377
2025-01-20 16:43:21 +02:00
Botond Dénes
1f20f7810e Merge 'main, encryption: correct misspellings' from Kefu Chai
in this changeset, some misspellings identified by codespell were corrected.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#22301

* github.com:scylladb/scylladb:
  ent/encryption: rename "sie" to "get_opt"
  ent,main: fix misspellings
2025-01-20 16:43:21 +02:00
Benny Halevy
5c77956205 docs: ddl: document the deprecation of compact tables
Add a paragraph documenting the decision to deprecate
the COMPACT STORAGE feature, and instruct the user
how to enable the feature despite that.

Note that we don't have an official migration strategy
for users like `DROP COMPACT STORAGE`, which is not
implemented at this time (See #3882).

Fixes #16375

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-01-20 08:14:39 +02:00
Benny Halevy
f3ab00e61c test: enable_create_table_with_compact_storage for tests that need it
Now enable_create_table_with_compact_storage can be set
to `false` by default in db/config.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-01-20 08:14:37 +02:00
Benny Halevy
0110eb0506 config: add enable_create_table_with_compact_storage
As discussed in
https://github.com/scylladb/scylladb/issues/12263#issuecomment-1853576813,
compact storage tables are deprecated.

Yet, there's is nothing in the code that prevents users
from creating such tables.

This patch adds a live-updateable config option:
`enable_create_table_with_compact_storage` that require users
to opt-in in order to create new tables WITH COMPACT STORAGE.

The option is currently set to `true` by default in db/config
to reduce the churn to tests and to `false` in scylla.yaml,
for new clusters.

TODO: once regressions tests that use compact storage
are converted to enable the option, change the default in
db/config to false.

A unit test was added to test/cql-pytest that
checks that the respective cql query fails as expected
with the default option or when it is explicitly set to `false`,
and that the query succeeds when the option is set to `true`.

Note that `check_restricted_table_properties` already
returns an optional warning, but it is only logged
but not returned in the `prepared_statement`.
Fixing that is out of the scope of this patch.
See https://github.com/scylladb/scylladb/issues/20945

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-01-20 08:03:25 +02:00
Kefu Chai
1ef2d9d076 tree: migrate from boost::adaptors::transformed to std::views::transform
Replace remaining uses of boost::adaptors::transformed with std::views::transform
to reduce Boost dependencies, following the migration pattern established in
bab12e3a. This change addresses recently merged code that reintroduced Boost
header dependencies through boost::adaptors::transformed usage.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22365
2025-01-17 16:56:40 +02:00
Botond Dénes
47989b1503 Merge 'tasks: add tablet resize virtual task' from Aleksandra Martyniuk
In this change, tablet_virtual_task starts supporting tablet
resize (i.e. split and merge).

Users can see running resize tasks - finished tasks are not
presented with the task manager API.

A new task state "suspended" is added. If a resize was revoked,
it will appear to users as suspended. We assume that the resize was revoked
when the tablet number didn't change.

Fixes: #21366.
Fixes: #21367.

No backport, new feature

Closes scylladb/scylladb#21891

* github.com:scylladb/scylladb:
  test: boost: check resize_task_info in tablet_test.cc
  test: add tests to check revoked resize virtual tasks
  test: add tests to check the list of resize virtual tasks
  test: add tests to check spilt and merge virtual tasks status
  test: test_tablet_tasks: generalize functions
  replica: service: add split virtual task's children
  replica: service: pass parent info down to storage_group::split
  tasks: children of virtual tasks aren't internal by default
  tasks: initialize shard in task_info ctor
  service: extend tablet_virtual_task::abort
  service: retrun status_helper struct from tablet_virtual_task::get_status_helper
  service: extend tablet_virtual_task::wait
  tasks: add suspended task state
  service: extend tablet_virtual_task::get_status
  service: extend tablet_virtual_task::contains
  service: extend tablet_virtual_task::get_stats
  service: add service::task_manager_module::get_nodes
  tasks: add task_manager::get_nodes
  tasks: drop noexcept from module::get_nodes
  replica: service: add resize_task_info static column to system.tablets
  locator: extend tablet_task_info to cover resize tasks
2025-01-17 14:24:07 +02:00
Piotr Dulikowski
6aa962f5f4 Merge 'Add audit subsystem for database operations' from Paweł Zakrzewski
Introduces a comprehensive audit system to track database operations for security
and compliance purposes. This change includes:

Core Components:
- New audit subsystem for logging database operations
- Service level integration for proper resource management
- CQL statement tracking with operation categories
- Login process integration for tenant management

Key Features:
- Configurable audit logging (syslog/table)
- Operation categorization (QUERY/DML/DDL/DCL/AUTH/ADMIN)
- Selective auditing by keyspace/table
- Password sanitization in audit logs
- Service level shares support (1-1000) for workload prioritization
- Proper lifecycle management and cleanup

I ran the dtests for audit (manually enabled) and they pass.
The in-repo tests pass.

Notably, there should be no non-whitespace changes between this and scylla-enterprise

Fixes scylladb/scylla-enterprise#4999

Closes scylladb/scylladb#22147

* github.com:scylladb/scylladb:
  audit: Add shares support to service level management
  audit: Add service level support to CQL login process
  audit: Add support to CQL statements
  audit: Integrate audit subsystem into Scylla main process
  audit: Add documentation for the audit subsystem
  audit: Add the audit subsystem
2025-01-17 13:14:55 +01:00
Kamil Braun
89ee2a6834 Merge 'drop ip addresses from token metadata' from Gleb
Now that all topology related code uses host ids there is not point to
maintain ip to id (and back) mappings in the token metadata. After the
patch the mapping will be maintained in the gossiper only. The rest of
the system will use host ids and in rare cases where translation is
needed (mostly for UX compatibility reasons) the translation will be
done using gossiper.

Fixes: scylladb/scylla#21777

* 'gleb/drop-ip-from-tm-v3' of github.com:scylladb/scylla-dev: (57 commits)
  hint manager: do not translate ip to id in case hint manager is stopped already
  locator: token_metadata: drop update_host_id() function that does nothing now
  locator: topology: drop indexing by ips
  repair: drop unneeded code
  storage_service: use host_id to look for a node in on_alive handler
  storage_proxy: translate ips to ids in forward array using gossiper
  locator: topology: remove unused functions
  storage_service: check for outdated ip in on_change notification in the peers table
  storage_proxy: translate id to ip using address map in tablets's describe_ring code instead of taking one from the topology
  topology coordinator: change connection dropping code to work on host ids
  cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query
  locator: drop unused function from tablet_effective_replication_map
  api: view_build_statuses: do not use IP from the topology, but translate id to ip using address map instead
  locator: token_metadata: remove unused ip based functions
  locator: network_topology_strategy: use host_id based function to check number of endpoints in dcs
  gossiper: drop get_unreachable_token_owners functions
  storage_service: use gossiper to map ip to id in node_ops operations
  storage_service: fix indentation after the last patch
  storage_service: drop loops from node ops replace_prepare handling since there can be only one replacing node
  token_metadata: drop no longer used functions
  ...
2025-01-17 11:00:52 +01:00
Kefu Chai
4a5a00347f utils: do not include unused headers
these unused includes were identifier by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22201
2025-01-17 11:24:54 +03:00
Botond Dénes
55963f8f79 replica: remove noexcept from token -> tablet resolution path
The methods to resolve a key/token/range to a table are all noexcept.
Yet the method below all of these, `storage_group_for_id()` can throw.
This means that if due to any mistake a tablet without local replica is
attempted to be looked up, it will result in a crash, as the exception
bubbles up into the noexcept methods.
There is no value in pretending that looking up the tablet replica is
noexcept, remove the noexcept specifiers so that any bad lookup only
fails the operation at hand and doesn't crash the node. This is
especially relevant to replace, which still has a window where writes
can arrive for tablets that don't (yet) have a local replica. Currently,
this results in a crash. After this patch, this will only fail the
writes and the replace can move on.

Fixes: #21480

Closes scylladb/scylladb#22251
2025-01-17 11:24:09 +03:00
Łukasz Paszkowski
adef719c43 api/storage_service: Remove unimplemented truncate API
The API /storage_service/truncate/{ks} returns an unimplemented
error when invoked. As we already have a CQL command,
`TRUNCATE TABLE ks.cf` that causes the table to be truncated on all
nodes, the API can be dropped. Due to the error, it is unused.

Fixes https://github.com/scylladb/scylladb/issues/10520

No backport is required. A small cleanup of not working API.

Closes scylladb/scylladb#22258
2025-01-17 11:21:05 +03:00
Pavel Emelyanov
14c3fbbf8c Merge 'sstable_directory: do not load remote unshared sstables in process_descriptor()' from Lakshmi Narayanan Sreethar
The sstable loader relied on the generation id to provide an efficient
hint about the shard that owns an sstable. But, this hint was rendered
ineffective with the introduction of UUID generation, as the shard id
was no longer embedded in the generation id. This also became suboptimal
with the introduction of tablets. Commit 0c77f77 addressed this issue by
reading the minimum from disk to determine sstable ownership but this
improvement was lost with commit 63f1969, which optimistically assumed
that hints would work most of the time, which isn't true.

This commit restores that change - shard id of a table is deduced by
reading minially from disk and then the sstable is fully loaded only if
it belongs to the local shard. This patch also adds a testcase to verify
that the sstable are loaded only in their respective shards.

Fixes #21015

This fixes a regression and should be backported.

Closes scylladb/scylladb#22263

* github.com:scylladb/scylladb:
  sstable_directory: do not load remote sstables in process_descriptor
  sstable_directory: update `load_sstable()` definition
  sstable_directory: reintroduce `get_shards_for_this_sstable()`
2025-01-17 11:17:54 +03:00
Asias He
387b2050df repair: Stop using rpc to update repair time for repairs scheduled by scheduler
If a tablet repair is scheduled by tablet repair scheduler, the repair
time for tombstone gc will be updated when the system.tablet.repair_time
is updated. Skip updating using rpc calls in this case.
2025-01-17 16:12:05 +08:00
Asias He
53e6025aa6 repair: Wire repair_time in system.tablets for tombstone gc
The repair_time in system.tablets will be updated when repair runs
successfully. We can now use it to update the repair time for tombstone
gc, i.e, when the system.tablets.repair_time is propagated, call
gc_state.update_repair_time() on the node that is the owner of the
tablet.

Since b3b3e880d3 ("repair: Reduce hints and batchlog flush"), the
repair time that could be used for tombstone gc might be smaller than
when the repair is started, so the actual repair time for tombstone gc
is returned by the repair rpc call from the repair master node.

Fixes #17507
2025-01-17 16:12:05 +08:00
Asias He
0b2fef74bc test: Disable flush_cache_time for two tablet repair tests
The cache of the hints and batchlog flush makes the exact repair time
check difficult in the test. Disabling it for two repair tests
that check the exact repair time.
2025-01-17 16:12:05 +08:00
Asias He
23afbd938c test: Introduce guarantee_repair_time_next_second helper
The repair time granularity is seconds. This helper makes sure the
repair time is different than the previous one.
2025-01-17 16:12:05 +08:00
Asias He
41a1eca072 repair: Return repair time for repair_service::repair_tablet
The repair time returned by repair_service::repair_tablet considers the
hints and batchlog flush time, so it could be used for the tombstone gc
purpose.
2025-01-17 16:12:05 +08:00
Asias He
614c3380c6 service: Add tablet_operation.hh
A tablet_operation_result struct is added to track the result of a
tablet operation.
2025-01-17 16:12:05 +08:00
Avi Kivity
d6f7f873d0 utils: config_file: don't use extern fully specialized variable templates
Declaring-but-not-defining a fully specialized template is a great way to
cut dependencies between users and providers, but unfortunately not
supported for variable templates. Clang 18 does support it, but
apparently it is a misinterpretation of the standard, and was removed
in clang 19.

We started using this non-feature in 7ed89266b3.

The fix is to use function templates. This is more verbose as
each specialization needs to define a static variable to return,
but is fully supported.

Closes scylladb/scylladb#22299
2025-01-17 11:06:50 +03:00
Botond Dénes
2428f22d3e Update tools/python3 submodule
* tools/python3 fbf12d02...8415caf4 (1):
  > dist: Support FIPS mode
2025-01-17 09:17:29 +02:00
Tzach Livyatan
a00ab65491 remove BETA from metric and API reference
Closes scylladb/scylladb#22092
2025-01-16 19:25:51 -05:00
Łukasz Paszkowski
aad46bd6f3 reader_concurrency_semaphore: do_wait_admission(): remove dumping diagnostics
The commit b39ca29b3c introduced detection of admission-waiter
anomaly and dumps permit diagnostics as soon as the semaphore did
not admit readers even though it could.

Later on, the commit bf3d0b3543 introduces the optimization where
the admission check is moved to the fiber processing the _read_list.
Since the semaphore no longer admits readers as soon as it can,
dumping diagnostic errors is not necessary as the situation is not
abnormal.

Closes scylladb/scylladb#22344
2025-01-16 19:23:43 -05:00
Nadav Har'El
955ac1b7b7 test/alternator: close boto3 client before shutting down
For several years now, we have seen a strange, and very rare, flakiness
in Alternator tests described in issue #17564: We see all the test pass,
pytest declares them to have passed, and while Python is existing, it
crashes with a signal 11 (SIGSEGV). Because this happens exclusively in
test/alternator and never in the test/cqlpy, we suspect that something
that the test/alternator leaves behind but test/cqlpy does not, causes
some race and crashes during shutdown.

The immediate suspect is the boto3 library, or rather, the urllib3 library
which it uses. This is more-or-less the only thing that test/alternator
does which test/cqlpy doesn't. The urllib3 library keeps around pools of
reusable connections, and it's possible (although I don't actually have any
proof for it) that these open connections may cause a crash during shutdown.

So in this patch I add to the "dynamodb" and "dynamodbstreams" fixtures
(which all Alternator tests use to connect to the server), a teardown which
calls close() for the boto3 client object. This close() call percolates
down to calling clear() on urllib3's PoolManager. Hopefully, this will
make some difference in the chance to crash during shutdown - and if it
doesn't, it won't hurt.

Refs #17564

Closes scylladb/scylladb#22341
2025-01-16 19:21:00 -05:00
Gleb Natapov
a40e810442 hint manager: do not translate ip to id in case hint manager is stopped already
Since we do not stop storage proxy on shutdown this code can be called
during shutdown when address map is no longer usable.
2025-01-16 16:37:08 +02:00
Gleb Natapov
1e4b2f25dc locator: token_metadata: drop update_host_id() function that does nothing now 2025-01-16 16:37:08 +02:00
Gleb Natapov
50fb22c8f9 locator: topology: drop indexing by ips
Do not track id to ip mapping in the topology class any longer. There
are no remaining users.
2025-01-16 16:37:08 +02:00
Gleb Natapov
f9df092fd1 repair: drop unneeded code
There is a code that creates a map from id to ip and then creates a
vector from the keys of the map. Create a vector directly instead.
2025-01-16 16:37:08 +02:00
Gleb Natapov
12da203cae storage_service: use host_id to look for a node in on_alive handler 2025-01-16 16:37:08 +02:00
Gleb Natapov
d45ce6fa12 storage_proxy: translate ips to ids in forward array using gossiper
We already use it to translate reply_to, so do it for consistency and to
drop ip based API usage.
2025-01-16 16:37:08 +02:00
Gleb Natapov
db73758655 locator: topology: remove unused functions 2025-01-16 16:37:07 +02:00
Gleb Natapov
fb28ff5176 storage_service: check for outdated ip in on_change notification in the peers table
The code checks that it does not run for an ip address that is no longer
in use (after ip address change). To check that we can use peers table
and see if the host id is mapped to the address. If yes, this is the
latest address for this host id otherwise this is an outdated entry.
2025-01-16 16:37:07 +02:00
Gleb Natapov
163099678e storage_proxy: translate id to ip using address map in tablets's describe_ring code instead of taking one from the topology
We want to drop ip from the locator::node.
2025-01-16 16:37:07 +02:00
Gleb Natapov
49fa1130ef topology coordinator: change connection dropping code to work on host ids
Do not use ip from topology::node, but look it up in address map
instead. We want to drop ip from the topology::node.
2025-01-16 16:37:07 +02:00
Gleb Natapov
83d15b8e32 cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query
We want to drop ip from the topology::node.
2025-01-16 16:37:07 +02:00