Commit Graph

706 Commits

Author SHA1 Message Date
Amnon Heiman
5747af8555 streaming/stream_manager.cc: label metrics with basic_level
The following metrics will be marked with basic_level label:
scylla_node_ops_finished_percentage

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2025-03-03 16:58:39 +02:00
Kefu Chai
da9960db1c tree: Fix polymorphic exception handling by using references
Replace value-based exception catching with reference-based catching to address
GCC warnings about polymorphic type slicing:

```
warning: catching polymorphic type ‘class seastar::rpc::stream_closed’ by value [-Wcatch-value=]
```

When catching polymorphic exceptions by value, the C++ runtime copies the
thrown exception into a new instance of the specified type, slicing the
actual exception and potentially losing important information. This change
ensures all polymorphic exceptions are caught by reference to preserve the
complete exception state.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#23064
2025-02-26 23:15:16 +02:00
Piotr Dulikowski
e4d574fdbb Merge 'Fix view-builder vs (repair and streaming) initialization order' from Pavel Emelyanov
Both, repair and streaming depend on view builder, but since the builder is started too late, both keep sharded<> reference on it and apply `if (view_builder.local_is_initialized())` safety checks.

However, view builder can do its sharded start much earlier, there's currently nothing that prevents it from doing so. This PR moves view builder start up together with some other of its dependencies, and relaxes the way repair and streaming use their view-builder references, in particular -- removes those ugly initialization checks.

refs: scylladb/scylladb#2737

Closes scylladb/scylladb#22676

* github.com:scylladb/scylladb:
  streaming: Relax streaming::make_streamig_consumer() view builder arg
  streaming: Keep non-sharded view_builder dependency reference
  streaming: Remove view_builder.local_is_initialized() checks
  repair: Keep non-sharded view_builder dependency reference
  repair: Remove view_builder.local_is_initialized() checks
  main: Start sharded<view_builder> earlier
  test/cql_env: Move stream manager start lower
2025-02-17 10:03:28 +01:00
Kefu Chai
34517b09a2 alternator,streaming: fix comment typos
Fix misspellings in comments identified by the codespell tool.
fix typos in comment

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22829
2025-02-16 11:34:44 +02:00
Kefu Chai
7ff0d7ba98 tree: Remove unused boost headers
This commit eliminates unused boost header includes from the tree.

Removing these unnecessary includes reduces dependencies on the
external Boost.Adapters library, leading to faster compile times
and a slightly cleaner codebase.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22857
2025-02-15 20:32:22 +02:00
Pavel Emelyanov
2970567b3a streaming: Relax streaming::make_streamig_consumer() view builder arg
Two callers of it -- repair and stream-manager -- both have non-sharded
reference and can just use it as argument. The helper in question gets
sharded<> one by itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-02-14 20:26:56 +03:00
Pavel Emelyanov
1140a875e1 streaming: Keep non-sharded view_builder dependency reference
Continuation of the previous path -- view builder is started early
enough and construction of stream manager can happen with non-sharded
reference on it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-02-14 20:26:56 +03:00
Pavel Emelyanov
3cb9758bd1 streaming: Remove view_builder.local_is_initialized() checks
Now stream_manager starts with sharded<view_builder> started and this
check can be dropped.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-02-14 20:26:56 +03:00
Asias He
6f04de3efd streaming: Fail stream plan on stream_mutation_fragments handler in case of error
The following is observed in pytest:

1) node1, stream master, tried to pull data from node3

2) node3, stream follower, found node1 restarted

3) node3 killed the rpc stream

4) node1 did not get the stream session failure message from node3. This
failure message was supposed to kill the stream plan on node1. That's the
reason node1 failed the stream session much later at "2024-08-19 21:07:45,539".
Note, node3 failed the stream on its side, so it should have sent the stream
session failure message.

```
$ cat node1.log |grep f890bea0-5e68-11ef-99ae-e5bca04385fc
INFO  2024-08-19 20:24:01,162 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Executing streaming plan for Tablet migration-ks-index-0 with peers={127.0.34.3}, master
ERROR 2024-08-19 20:24:01,190 [shard 1:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Failed to handle STREAM_MUTATION_FRAGMENTS (receive and distribute phase) for ks=ks, cf=cf, peer=127.0.34.3: seastar::nested_exception: seastar::rpc::stream_closed (rpc stream was closed by peer) (while cleaning up after seastar::rpc::stream_closed (rpc stream was closed by peer))
WARN  2024-08-19 21:07:45,539 [shard 0:main] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming plan for Tablet migration-ks-index-0 failed, peers={127.0.34.3}, tx=0 KiB, 0.00 KiB/s, rx=484 KiB, 0.18 KiB/s

$ cat node3.log |grep f890bea0-5e68-11ef-99ae-e5bca04385fc
INFO  2024-08-19 20:24:01,163 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Executing streaming plan for Tablet migration-ks-index-0 with peers=127.0.34.1, slave
INFO  2024-08-19 20:24:01,164 [shard 1:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Start sending ks=ks, cf=cf, estimated_partitions=2560, with new rpc streaming
WARN  2024-08-19 20:24:01,187 [shard 0: gms] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming plan for Tablet migration-ks-index-0 failed, peers={127.0.34.1}, tx=633 KiB, 26506.81 KiB/s, rx=0 KiB, 0.00 KiB/s
WARN  2024-08-19 20:24:01,188 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] stream_transfer_task: Fail to send to 127.0.34.1:0: seastar::rpc::stream_closed (rpc stream was closed by peer)
WARN  2024-08-19 20:24:01,189 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Failed to send: seastar::rpc::stream_closed (rpc stream was closed by peer)
WARN  2024-08-19 20:24:01,189 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming error occurred, peer=127.0.34.1
```

To be safe in case the stream fail message is not received, node1 could fail
the stream plan as soon as the rpc stream is aborted in the
stream_mutation_fragments handler.

Fixes #20227

Closes scylladb/scylladb#21960
2025-02-10 16:32:12 +01:00
Asias He
4018dc7f0d Introduce file stream for tablet
File based stream is a new feature that optimizes tablet movement
significantly. It streams the entire SSTable files without deserializing
SSTable files into mutation fragments and re-serializing them back into
SSTables on receiving nodes. As a result, less data is streamed over the
network, and less CPU is consumed, especially for data models that
contain small cells.

The following patches are imported from the scylla enterprise:

*) Merge 'Introduce file stream for tablet' from Asias He

    This patch uses Seastar RPC stream interface to stream sstable files on
    network for tablet migration.

    It streams sstables instead of mutation fragments. The file based
    stream has multiple advantages over the mutation streaming.

    - No serialization or deserialization for mutation fragments
    - No need to read and process each mutation fragments
    - On wire data is more compact and smaller

    In the test below, a significant speed up is observed.

    Two nodes, 1 shard per node, 1 initial_tablets:

    - Start node 1
    - Insert 10M rows of data with c-s
    - Bootstrap node 2

    Node 1 will migration data to node2 with the file stream.

    Test results:

    1) File stream: bytes on wire = 1132006250 bytes, bw = 836MB/s

    [shard 0:stre] stream_blob - stream_sstables[eadaa8e0-a4f2-4cc6-bf10-39ad1ce106b0]
	Finished sending sstable_nr=2 files_nr=18 files={} range=(-1,9223372036854775807] bytes_sent=1132006250 stream_bw=836MB/s
    [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 1.08004s seconds

    2) Mutation stream: bytes on wire = 3030004736 bytes, bw = 125410.87 KiB/s = 128MB/s

    [shard 0:stre] stream_session - [Stream #406dc8b0-56b5-11ee-bc2d-000bf4871058]
	Streaming plan for Tablet migration-ks1-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=2958989 KiB, 125410.87 KiB/s
    [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 23.5992s seconds

    Test Summary:

    File stream v.s. Mutation stream improvements

    - Stream bandwidth = 836 / 128  (MB/s)  = 6.53X

    - Stream time = 23.60 / 1.08  (Seconds) = 21.85X

    - Stream bytes on wire = 3030004736 / 1132006250 (Bytes)= 2.67X

    Closes scylladb/scylla-enterprise#3438

    * github.com:scylladb/scylla-enterprise:
      tests: Add file_stream_test
      streaming: Implement file stream for tablet

*) streaming: Use new take_storage_snapshot interface

    The new take_storage_snapshot returns a file object instead of a file
    name. This allows the file stream sender to read from the file even if
    the file is deleted by compaction.

    Closes scylladb/scylla-enterprise#3728

*) streaming: Protect unsupported file types for file stream

    Currently, we assume the file streamed over the stream_blob rpc verb is
    a sstable file. This patch rejects the unsupported file types on the
    receiver side. This allows us to stream more file types later using the
    current file stream infrastructure without worrying about old nodes
    processing the new file types in the wrong way.

    - The file_ops::noop is renamed to file_ops::stream_sstables to be
      explicit about the file types

    - A missing test_file_stream_error_injection is added to the idl

    Fixes: #3846
    Tests: test_unsupported_file_ops

    Closes scylladb/scylla-enterprise#3847

*) idl: Add service::session_id id to idl

    It will be used in the next patch.

    Refs #3907

*) streaming: Protect file stream with topology_guard

    Similar to "storage_service, tablets: Use session to guard tablet
    streaming", this patch protects file stream with topology_guard.

    Fixes #3907

*) streaming: Take service topology_guard under the try block

    Taking the service::topology_guard could throw. Currently, it throws
    outside the try block, so the rpc sink will not be closed, causing the
    following assertion:

    ```
    scylla: seastar/include/seastar/rpc/rpc_impl.hh:815: virtual
    seastar::rpc::sink_impl<netw::serializer,
    streaming::stream_blob_cmd_data>::~sink_impl() [Serializer =
    netw::serializer, Out = <streaming::stream_blob_cmd_data>]: Assertion
    `this->_con->get()->sink_closed()' failed.
    ```

    To fix, move more code including the topology_guard taking code to the
    try block.

    Fixes https://github.com/scylladb/scylla-enterprise/issues/4106

    Closes scylladb/scylla-enterprise#4110

*) Merge 'Preserve original SSTable state with file based tablet migration' from Raphael "Raph" Carvalho

    We're not preserving the SSTable state across file based migration, so
    staging SSTables for example are being placed into main directory, and
    consequently, we're mixing staging and non-staging data, losing the
    ability to continue from where the old replica left off.
    It's expected that the view update backlog is transferred from old
    into new replica, as migration doesn't wait for leaving replica to
    complete view update work (which can take long). Elasticity is preferred.

    So this fix guarantees that the state of the SSTable will be preserved
    by propagating it in form of subdirectory (each subdirectory is
    statically mapped with a particular state).

    The staging sstables aren't being registered into view update generator
    yet, as that's supposed to be fixed in OSS (more details can be found
    at https://github.com/scylladb/scylladb/issues/19149).

    Fixes #4265.

    Closes scylladb/scylla-enterprise#4267

    * github.com:scylladb/scylla-enterprise:
      tablet: Preserve original SSTable state with file based tablet migration
      sstables: Add get method for sstable state

*) sstable: (Re-)add shareabled_components getter

*) Merge 'File streaming sstables: Use sstable source/sink to transfer snapshots' from Calle Wilund

    Fixes #4246

    Alternative approach/better separation of concern, transport vs. sstable layer. Builds on #4472, but fancier.

    Ensures we transfer and pre-process scylla metadata for streamed
    file blobs first, then properly apply receiving nodes local config
    by using a source and sink layer exported from sstables, which
    handles things like ordering, metadata filtering (on source) as well
    as handling metadata and proper IO paths when writing data on
    receiver node (sink).

    This implementation maintains the statelessness of the current
    design, and the delegated sink side will re-read and re-write the
    metadata for each component processed. This is a little wasteful,
    but the meta is small, and it is less error prone than trying to do
    caching cross-shards etc. The transport is isolated from the
    knowledge.

    This is an alternative/complement to #4436 and #4472, fixing the
    underlying issue. Note that while the layers/API:s here allows easy
    fixing of other fundamental problems in the feature (such as
    destination location etc), these are not included in the PR, to keep
    it as close to the current behaviour as possible.

    Closes scylladb/scylla-enterprise#4646

    * github.com:scylladb/scylla-enterprise:
      raft_tests: Copy/add a topology test with encryption
      file streaming: Use sstable source/sink to transfer snapshots
      sstables: Add source and sink objects + producers for transfering a snapshot
      sstable::types: Add remove accessor for extension info in metadata

*) The change for error injection in merge commit 966ea5955dd8760:

    File streaming now has "stream_mutation_fragments" error injection points
    so test_table_dropped_during_streaming works with file streaming.

*) doc: document file-based streaming

    This commit adds a description of the file-based streaming feature to the documentation.

    It will be displayed in the docs using the scylladb_include_flag directive after
    https://github.com/scylladb/scylladb/pull/20182 is merged, backported to branch-6.0,
    and, in turn, branch-2024.2.

    Refs https://github.com/scylladb/scylla-enterprise/issues/4585
    Refs https://github.com/scylladb/scylla-enterprise/issues/4254

    Closes scylladb/scylla-enterprise#4587

*) doc: move File-based streaming to the Tablets source file-based-streaming

    This commit moves the description of file-based streaming from a common include file
    to the regular doc source file where tablets are described.

    Closes scylladb/scylla-enterprise#4652

*) streaming: sstable_stream_sink_impl: abort: prevent null pointer dereference

Closes scylladb/scylladb#22467
2025-01-26 12:51:59 +02:00
Avi Kivity
59d3a66d18 Revert "Introduce file stream for tablet"
This reverts commit 8208688178. It was
contributed from enterprise, but is too different from the original
for me to merge back.
2025-01-22 09:42:20 +02:00
Asias He
8208688178 Introduce file stream for tablet
File based stream is a new feature that optimizes tablet movement
significantly. It streams the entire SSTable files without deserializing
SSTable files into mutation fragments and re-serializing them back into
SSTables on receiving nodes. As a result, less data is streamed over the
network, and less CPU is consumed, especially for data models that
contain small cells.

The following patches are imported from the scylla enterprise:

*) Merge 'Introduce file stream for tablet' from Asias He

    This patch uses Seastar RPC stream interface to stream sstable files on
    network for tablet migration.

    It streams sstables instead of mutation fragments. The file based
    stream has multiple advantages over the mutation streaming.

    - No serialization or deserialization for mutation fragments
    - No need to read and process each mutation fragments
    - On wire data is more compact and smaller

    In the test below, a significant speed up is observed.

    Two nodes, 1 shard per node, 1 initial_tablets:

    - Start node 1
    - Insert 10M rows of data with c-s
    - Bootstrap node 2

    Node 1 will migration data to node2 with the file stream.

    Test results:

    1) File stream: bytes on wire = 1132006250 bytes, bw = 836MB/s

    [shard 0:stre] stream_blob - stream_sstables[eadaa8e0-a4f2-4cc6-bf10-39ad1ce106b0]
	Finished sending sstable_nr=2 files_nr=18 files={} range=(-1,9223372036854775807] bytes_sent=1132006250 stream_bw=836MB/s
    [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 1.08004s seconds

    2) Mutation stream: bytes on wire = 3030004736 bytes, bw = 125410.87 KiB/s = 128MB/s

    [shard 0:stre] stream_session - [Stream #406dc8b0-56b5-11ee-bc2d-000bf4871058]
	Streaming plan for Tablet migration-ks1-index-0 succeeded, peers={127.0.0.1}, tx=0 KiB, 0.00 KiB/s, rx=2958989 KiB, 125410.87 KiB/s
    [shard 0:stre] storage_service - Streaming for tablet migration of a4f68900-568a-11ee-b7b9-c2b13945eed2:1 took 23.5992s seconds

    Test Summary:

    File stream v.s. Mutation stream improvements

    - Stream bandwidth = 836 / 128  (MB/s)  = 6.53X

    - Stream time = 23.60 / 1.08  (Seconds) = 21.85X

    - Stream bytes on wire = 3030004736 / 1132006250 (Bytes)= 2.67X

    Closes scylladb/scylla-enterprise#3438

    * github.com:scylladb/scylla-enterprise:
      tests: Add file_stream_test
      streaming: Implement file stream for tablet

*) streaming: Use new take_storage_snapshot interface

    The new take_storage_snapshot returns a file object instead of a file
    name. This allows the file stream sender to read from the file even if
    the file is deleted by compaction.

    Closes scylladb/scylla-enterprise#3728

*) streaming: Protect unsupported file types for file stream

    Currently, we assume the file streamed over the stream_blob rpc verb is
    a sstable file. This patch rejects the unsupported file types on the
    receiver side. This allows us to stream more file types later using the
    current file stream infrastructure without worrying about old nodes
    processing the new file types in the wrong way.

    - The file_ops::noop is renamed to file_ops::stream_sstables to be
      explicit about the file types

    - A missing test_file_stream_error_injection is added to the idl

    Fixes: #3846
    Tests: test_unsupported_file_ops

    Closes scylladb/scylla-enterprise#3847

*) idl: Add service::session_id id to idl

    It will be used in the next patch.

    Refs #3907

*) streaming: Protect file stream with topology_guard

    Similar to "storage_service, tablets: Use session to guard tablet
    streaming", this patch protects file stream with topology_guard.

    Fixes #3907

*) streaming: Take service topology_guard under the try block

    Taking the service::topology_guard could throw. Currently, it throws
    outside the try block, so the rpc sink will not be closed, causing the
    following assertion:

    ```
    scylla: seastar/include/seastar/rpc/rpc_impl.hh:815: virtual
    seastar::rpc::sink_impl<netw::serializer,
    streaming::stream_blob_cmd_data>::~sink_impl() [Serializer =
    netw::serializer, Out = <streaming::stream_blob_cmd_data>]: Assertion
    `this->_con->get()->sink_closed()' failed.
    ```

    To fix, move more code including the topology_guard taking code to the
    try block.

    Fixes https://github.com/scylladb/scylla-enterprise/issues/4106

    Closes scylladb/scylla-enterprise#4110

*) Merge 'Preserve original SSTable state with file based tablet migration' from Raphael "Raph" Carvalho

    We're not preserving the SSTable state across file based migration, so
    staging SSTables for example are being placed into main directory, and
    consequently, we're mixing staging and non-staging data, losing the
    ability to continue from where the old replica left off.
    It's expected that the view update backlog is transferred from old
    into new replica, as migration doesn't wait for leaving replica to
    complete view update work (which can take long). Elasticity is preferred.

    So this fix guarantees that the state of the SSTable will be preserved
    by propagating it in form of subdirectory (each subdirectory is
    statically mapped with a particular state).

    The staging sstables aren't being registered into view update generator
    yet, as that's supposed to be fixed in OSS (more details can be found
    at https://github.com/scylladb/scylladb/issues/19149).

    Fixes #4265.

    Closes scylladb/scylla-enterprise#4267

    * github.com:scylladb/scylla-enterprise:
      tablet: Preserve original SSTable state with file based tablet migration
      sstables: Add get method for sstable state

*) sstable: (Re-)add shareabled_components getter

*) Merge 'File streaming sstables: Use sstable source/sink to transfer snapshots' from Calle Wilund

    Fixes #4246

    Alternative approach/better separation of concern, transport vs. sstable layer. Builds on #4472, but fancier.

    Ensures we transfer and pre-process scylla metadata for streamed
    file blobs first, then properly apply receiving nodes local config
    by using a source and sink layer exported from sstables, which
    handles things like ordering, metadata filtering (on source) as well
    as handling metadata and proper IO paths when writing data on
    receiver node (sink).

    This implementation maintains the statelessness of the current
    design, and the delegated sink side will re-read and re-write the
    metadata for each component processed. This is a little wasteful,
    but the meta is small, and it is less error prone than trying to do
    caching cross-shards etc. The transport is isolated from the
    knowledge.

    This is an alternative/complement to #4436 and #4472, fixing the
    underlying issue. Note that while the layers/API:s here allows easy
    fixing of other fundamental problems in the feature (such as
    destination location etc), these are not included in the PR, to keep
    it as close to the current behaviour as possible.

    Closes scylladb/scylla-enterprise#4646

    * github.com:scylladb/scylla-enterprise:
      raft_tests: Copy/add a topology test with encryption
      file streaming: Use sstable source/sink to transfer snapshots
      sstables: Add source and sink objects + producers for transfering a snapshot
      sstable::types: Add remove accessor for extension info in metadata

*) The change for error injection in merge commit 966ea5955dd8760:

    File streaming now has "stream_mutation_fragments" error injection points
    so test_table_dropped_during_streaming works with file streaming.

*) doc: document file-based streaming

    This commit adds a description of the file-based streaming feature to the documentation.

    It will be displayed in the docs using the scylladb_include_flag directive after
    https://github.com/scylladb/scylladb/pull/20182 is merged, backported to branch-6.0,
    and, in turn, branch-2024.2.

    Refs https://github.com/scylladb/scylla-enterprise/issues/4585
    Refs https://github.com/scylladb/scylla-enterprise/issues/4254

    Closes scylladb/scylla-enterprise#4587

*) doc: move File-based streaming to the Tablets source file-based-streaming

    This commit moves the description of file-based streaming from a common include file
    to the regular doc source file where tablets are described.

    Closes scylladb/scylla-enterprise#4652

*) streaming: sstable_stream_sink_impl: abort: prevent null pointer dereference

Closes scylladb/scylladb#22034
2025-01-20 16:43:21 +02:00
Botond Dénes
686a997c04 Merge 'Complete implementation of configuring IO bandwidth limits' from Pavel Emelyanov
In Scylla there are two options that control IO bandwidth limit -- the /storage_service/(compaction|stream)_throughput REST API endpoints. The endpoints are partially implemented and have no counterparts in the nodetool.

This set implements the missing bits and adds tests for new functionality.

Closes scylladb/scylladb#21877

* github.com:scylladb/scylladb:
  nodetool: Implement [gs]etstreamthroughput commands
  nodetool: Implement [gs]etcompationthroughput commands
  test: Add validation of how IO-updating endpoints work
  api: Implement /storage_service/(stream|compaction)_throughput endpoints
  api: Disqualify const config reference
  api: Implement /storage_service/stream_throughput endpoint
  api: Move stream throughput set/get endpoints from storage service block
  api: Move set_compaction_throughput_mb_per_sec to config block
  util: Include fmt/ranges.h in config_file.hh
2025-01-14 07:56:38 -05:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Gleb Natapov
41a57ed2e8 streaming: move streaming code to use host ids instead of host ips
The patch is rather large, but it is a straightforward conversion from
one type to another.
2024-12-15 11:31:11 +02:00
Pavel Emelyanov
f3775ba957 api: Implement /storage_service/stream_throughput endpoint
The value can be obtained from the stream_manager

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 11:51:52 +03:00
Gleb Natapov
92c2558a83 streaming: move streaming verbs to IDL 2024-12-11 18:26:50 +02:00
Gleb Natapov
cd9b349886 migration_manager: move to use host ids instead of ips
Users also amended to pass ids instead of ips.
2024-12-02 10:31:12 +02:00
Kefu Chai
59eb2ab119 treewide: s/boost::algorithm::any_of/std::ranges::any_of/
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::any_of`.

in this change, we replace `boost::algorithm::any_of` with
`std::ranges::any_of`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-05 14:06:09 +08:00
Kefu Chai
2498e37a2f mutation_writer,streaming: use reader_consumer_v2 type when appropriate
The `reader_consumer_v2` type
(`std::function<future<> (mutation_reader)>`) is defined alongside
`mutation_reader` in `mutation_reader.hh`.

before this change, we sometimes use
`std::function<future<> (mutation_reader)>` directly when defining a
consumer parameter or a consumer variable.

in this change, we improve maintainability by:

- Reducing duplicate function type declarations
- Centralizing the consumer type definition
- Making future signature updates easier to implement

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21369
2024-10-31 07:17:47 +02:00
Botond Dénes
7c75fc599f streaming: stream-session: switch to tracking permit
The stream-session is the receiving end of streaming, it reads the
mutation fragment stream from an RPC stream and writes it onto the disk.
As such, this part does no disk IO and therefore, using a permit with
count resources is superfluous. Furthermore, after
d98708013c, the count resources on this
permit can cause a deadlock on the receiver end, via the
`db::view::check_view_update_path()`, which wants to read the content of
a system table and therefore has to obtain a permit of its own.

Switch to a tracking-only permit, primarily to resolve the deadlock, but
also because admission is not necessary for a read which does no IO.

Refs: scylladb/scylladb#20885 (partial fix, solves only one of the deadlocks)
Fixes: scylladb/scylladb#21264

Closes scylladb/scylladb#21059
2024-10-27 20:01:25 +02:00
Kefu Chai
6ead5a4696 treewide: move log.hh into utils/log.hh
the log.hh under the root of the tree was created keep the backward
compatibility when seastar was extracted into a separate library.
so log.hh should belong to `utils` directory, as it is based solely
on seastar, and can be used all subsystems.

in this change, we move log.hh into utils/log.hh to that it is more
modularized. and this also improves the readability, when one see
`#include "utils/log.hh"`, it is obvious that this source file
needs the logging system, instead of its own log facility -- please
note, we do have two other `log.hh` in the tree.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-22 06:54:46 +03:00
Benny Halevy
d34878e96c view: check_needs_view_update_path: get token_metadata_ptr
check_needs_view_update_path is async and might yield
so the token_metadata reference passed to it must be kept
alive throughout the call.

Fixes scylladb/scylladb#20979

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#20980
2024-10-09 20:56:21 +03:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Patryk Wrobel
a89e3d10af code-cleanup: add missing header guards
The following command had been executed to get the
list of headers that did not contain '#pragma once':
'grep -rnw . -e "#pragma once" --include *.hh -L'

This change adds missing include guard to headers
that did not contain any guard.

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>

Closes scylladb/scylladb#19626
2024-07-09 18:31:35 +03:00
Avi Kivity
fdc1449392 treewide: rename flat_mutation_reader_v2 to mutation_reader
flat_mutation_reader_v2 was introduced in a pair of commits in 2021:

  e3309322c3 "Clone flat_mutation_reader related classes into v2 variants"
  08b5773c12 "Adapt flat_mutation_reader_v2 to the new version of the API"

as a replacement for flat_mutation_reader, using range_tombstone_change
instead of range_tombstone to represent represent range tombstones. See
those commits for more information.

The transition was incremental; the last use of the original
flat_mutation_reader was removed in 2022 in commit

  026f8cc1e7 "db: Use mutation_partition_v2 in mvcc"

In turn, flat_mutation_reader was introduced in 2017 in commit

  748205ca75 "Introduce flat_mutation_reader"

To transition from a mutation_reader that nested rows within
a partition in a separate stream, to a flat reader that streamed
partitions and rows in the same stream.

Here, we reclaim the original name and rename the awkward
flat_mutation_reader_v2 to mutation_reader.

Note that mutation_fragment_v2 remains since we still use the original
for compatibilty, sometimes.

Some notes about the transition:

 - files were also renamed. In one case (flat_mutation_reader_test.cc), the
   rename target already existed, so we rename to
    mutation_reader_another_test.cc.

 - a namespace 'mutation_reader' with two definitions existed (in
   mutation_reader_fwd.hh). Its contents was folded into the mutation_reader
   class. As a result, a few #includes had to be adjusted.

Closes scylladb/scylladb#19356
2024-06-21 07:12:06 +03:00
Pavel Emelyanov
8906126a2c stream_manager: Remove system_distributed_keyspace and view_update_generator
Now all the code is happy with view_builder and can be shortened

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-23 13:41:56 +03:00
Pavel Emelyanov
ae2dcdc7c2 streaming: Remove system_distributed_keyspace and view_update_generator
Now all the code is happy with view_builder and can be shortened

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-23 13:41:55 +03:00
Pavel Emelyanov
66a8035b64 view: Make register_staging_sstable() a method of view_builder
Callers of it had just checked if an sstable still has some views
building, so the should talk to view-builder to register the sstable
that's now considered to be staging.

Effectively. this is to hide the view-update-generator from other
services and make them communicate with the builder only.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-23 13:41:47 +03:00
Pavel Emelyanov
92ff0d3fc3 view: Make check_view_build_ongoing() helper a method of view_builder
This helper checks if there's an ongoing build of a view, and it's in
fact internal to view-builder, who keeps its status in one of its
system tables.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-23 13:41:47 +03:00
Pavel Emelyanov
57517d5987 streaming: Proparage view_builder& down to make_streaming_consumer()
Continuation of the previous patch. Repair itself doesn't need it, but
streaming consumer does.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-23 13:41:46 +03:00
Pavel Emelyanov
d917b06857 stream_manager: Add view builder dependency
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-23 13:32:28 +03:00
Tomasz Grabiec
fdcaaea91a tablets, streaming: Implement tablet streaming for intra-node migration 2024-05-16 00:28:46 +02:00
Asias He
1ca779d287 streaming: Fix use after move in fire_stream_event
The event is used in a loop.

Found by clang-tidy:

```
streaming/stream_result_future.cc:80:49: warning: 'event' used after it was moved [bugprone-use-after-move]
        listener->handle_stream_event(std::move(event));
                                                ^
streaming/stream_result_future.cc:80:39: note: move occurred here
        listener->handle_stream_event(std::move(event));
                                      ^
streaming/stream_result_future.cc:80:49: note: the use happens in a later loop iteration than the move
        listener->handle_stream_event(std::move(event));
                                                ^
```

Fixes #18332

Closes scylladb/scylladb#18333
2024-04-25 16:48:54 +03:00
Kefu Chai
a439ebcfce treewide: include fmt/ranges.h and/or fmt/std.h
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:16 +08:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Pavel Emelyanov
1f44a374b8 error_injection: Overload inject() instead of inject_with_handler()
The inject_with_handler() method accepts a coroutine that can be called
wiht injection_handler. With such function as an argument, there's no
need in distinctive inject_with_handler() name for a method, it can be
overload of all the existing inject()-s

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-03-11 19:30:19 +03:00
Kefu Chai
5687c289f4 repair: add fmt::formatter for streaming::stream_summary
before this change, we rely on the default-generated fmt::formatter created
from operator<<, but fmt v10 dropped the default-generated formatter.

in this change, we define formatters for streaming::stream_summary, and
drop its operator<<

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-03-09 23:43:32 +08:00
Asias He
83a28342ea service: Drop unused table param from session_topology_guard
The table param is not used. Dropping it so it can be used in places
where the table object is not available.

Closes scylladb/scylladb#17628
2024-03-07 09:34:40 +02:00
Botond Dénes
959d33ba39 Merge 'repair: streaming: handle no_such_column_family from remote node' from Aleksandra Martyniuk
RPC calls lose information about the type of returned exception.
Thus, if a table is dropped on receiver node, but it still exists
on a sender node and sender node streams the table's data, then
the whole operation fails.

To prevent that, add a method which synchronizes schema and then
checks, if the exception was caused by table drop. If so,
the exception is swallowed.

Use the method in streaming and repair to continue them when
the table is dropped in the meantime.

Fixes: #17028.
Fixes: #15370.
Fixes: #15598.

Closes scylladb/scylladb#17231

* github.com:scylladb/scylladb:
  repair: handle no_such_column_family from remote node gracefully
  test: test drop table on receiver side during streaming
  streaming: fix indentation
  streaming: handle no_such_column_family from remote node gracefully
  repair: add methods to skip dropped table
2024-02-23 08:25:45 +02:00
Nadav Har'El
b0233c0833 Merge 'interval: rename nonwrapping_interval to interval' from Avi Kivity
Our interval template started life as `range`, and was supported wrapping to follow Cassandra's convention of wrapping around the maximum token.

We later recognized that an interval type should usually be non-wrapping and split it into wrapping_range and nonwrapping_range, with `range` aliasing wrapping_range to preserve compatibility.

Even later, we realized the name was already taken by C++ ranges and so renamed it to `interval`. Given that intervals are usually non-wrapping, the default `interval` type is non-wrapping.

We can now simplify it further, recognizing that everyone assumes that an interval is non-wrapping and so doesn't need the nonwrapping_interval_designation. We just rename nonwrapping_interval to `interval` and remove the type alias.

Closes scylladb/scylladb#17455

* github.com:scylladb/scylladb:
  interval: rename nonwrapping_interval to interval
  interval: rename interval_test to wrapping_interval_test
2024-02-22 14:03:43 +02:00
Kefu Chai
f644ba9cdc streaming: add fmt::formatter for streaming::stream_request
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `streaming::stream_request`,
and drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-22 14:03:59 +08:00
Kefu Chai
618091f6f7 streaming: add fmt::formatter for stream_session_state
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for
`streaming::stream_session_state`, and drop its operator<<

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-22 14:03:59 +08:00
Avi Kivity
51df8b9173 interval: rename nonwrapping_interval to interval
Our interval template started life as `range`, and was supported
wrapping to follow Cassandra's convention of wrapping around the
maximum token.

We later recognized that an interval type should usually be non-wrapping
and split it into wrapping_range and nonwrapping_range, with `range`
aliasing wrapping_range to preserve compatibility.

Even later, we realized the name was already taken by C++ ranges and
so renamed it to `interval`. Given that intervals are usually non-wrapping,
the default `interval` type is non-wrapping.

We can now simplify it further, recognizing that everyone assumes
that an interval is non-wrapping and so doesn't need the
nonwrapping_interval_designation. We just rename nonwrapping_interval
to `interval` and remove the type alias.
2024-02-21 19:43:17 +02:00
Avi Kivity
605bf6e221 range.hh: retire
range.hh was deprecated in bd794629f9 (2020) since its names
conflict with the C++ library concept of an iterator range. The name
::range also mapped to the dangerous wrapping_interval rather than
nonwrapping_interval.

Complete the deprecation by removing range.hh and replacing all the
aliases by the names they point to from the interval library. Note
this now exposes uses of wrapping intervals as they are now explicit.

The unit tests are renamed and range.hh is deleted.

Closes scylladb/scylladb#17428
2024-02-21 00:24:25 +02:00
Aleksandra Martyniuk
2ea5d9b623 test: test drop table on receiver side during streaming 2024-02-15 12:06:47 +01:00
Aleksandra Martyniuk
b08f539427 streaming: fix indentation 2024-02-15 12:06:47 +01:00
Aleksandra Martyniuk
219e1eda09 streaming: handle no_such_column_family from remote node gracefully
If no_such_column_family is thrown on remote node, then streaming
operation fails as the type of exception cannot be determined.

Use repair::with_table_drop_silenced in streaming to continue
operation if a table was dropped.
2024-02-15 12:06:47 +01:00
Pavel Emelyanov
7a710425f0 streaming: Open-code on-stack lambda
It just wraps one if, no benefit in keeping it this way

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17250
2024-02-09 20:31:09 +01:00
Kefu Chai
770baa806e streaming: ignore failures when streaming dropped tables
before this change, when performing `stream_transfer_task`, if an
exception is raised, we check if the table being streamed is still
around, if it is missing, we just skip the table as it should be
dropped during streaming, otherwise we consider it a failure, and
report it back to the peer. this behavior was introduced by
953af382.

but we perform the streaming on all shards in parallel, and if any
of the shards fail because of the dropped table, the exception is
thrown. and the current shard is not necessarily the one which
throws the exception. actually, current shard might be still
waiting for a write lock for removing the table from the database's
table metadata. in that case, we consider the streaming RPC call a
failure even if the table is already removed on some shard(s). and
the peer would fail to bootstreap because of streaming failure.

in this change, before catching all exceptions, we handle
`no_such_column_family`, and do not fail the streaming in that case.
please note, we don't touch other tables, so we can just assume that
`no_such_column_family` is thrown only if the table to be transferred
is missing. that's why `assert()` is added.

Fixes #15370
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17160
2024-02-08 14:07:22 +02:00