Commit Graph

43434 Commits

Author SHA1 Message Date
Nadav Har'El
96dff367f8 Merge 'storage_proxy: update view update backlog on correct shard when writing' from Wojciech Mitros
This series is another approach of https://github.com/scylladb/scylladb/pull/18646 and https://github.com/scylladb/scylladb/pull/19181. In this series we only change where the view backlog gets
updated - we do not assure that the view update backlog returned in a response is necessarily the backlog
that increased due to the corresponding write, the returned backlog may be outdated up to 10ms. Because
 this series does not include this change, it's considerably less complex and it doesn't modify the common
write patch, so no particular performance considerations were needed in that context. The issue being fixed
is still the same, the full description can be seen below.

When a replica applies a write on a table which has a materialized view
it generates view updates. These updates take memory which is tracked
by `database::_view_update_concurrency_sem`, separate on each shard.
The fraction of units taken from the semaphore to the semaphore limit
is the shard's view update backlog. Based on these backlogs, we want
to estimate how busy a node is with its view updates work. We do that
by taking the max backlog across all shards.
To avoid excessive cross-shard operations, the node's (max) backlog isn't
calculated each time we need it, but up to 1 time per 10ms (the `_interval`) with an optimization where the backlog of the calculating shard is immediately up-to-date (we don't need cross-shard operations for it):
```
update_backlog node_update_backlog::fetch() {
    auto now = clock::now();
    if (now >= _last_update.load(std::memory_order_relaxed) + _interval) {
        _last_update.store(now, std::memory_order_relaxed);
        auto new_max = boost::accumulate(
                _backlogs,
                update_backlog::no_backlog(),
                [] (const update_backlog& lhs, const per_shard_backlog& rhs) {
                    return std::max(lhs, rhs.load());
                });
        _max.store(new_max, std::memory_order_relaxed);
        return new_max;
    }
    return std::max(fetch_shard(this_shard_id()), _max.load(std::memory_order_relaxed));
}
```
For the same reason, even when we do calculate the new node's backlog,
we don't read from the `_view_update_concurrency_sem`. Instead, for
each shard we also store a update_backlog atomic which we use for
calculation:
```
    struct per_shard_backlog {
        // Multiply by 2 to defeat the prefetcher
        alignas(seastar::cache_line_size * 2) std::atomic<update_backlog> backlog = update_backlog::no_backlog();
        need_publishing need_publishing = need_publishing::no;

        update_backlog load() const {
            return backlog.load(std::memory_order_relaxed);
        }
    };
 std::vector<per_shard_backlog> _backlogs;
```
Due to this distinction, the update_backlog atomic need to be updated
separately, when the `_view_update_concurrency_sem` changes.
This is done by calling `storage_proxy::update_view_update_backlog`, which reads the `_view_update_concurrency_sem` of the shard (in `database::get_view_update_backlog`)
and then calls node`_update_backlog::add` where the read backlog
is stored in the atomic:
```
void storage_proxy::update_view_update_backlog() {
    _max_view_update_backlog.add(get_db().local().get_view_update_backlog());
}
void node_update_backlog::add(update_backlog backlog) {
    _backlogs[this_shard_id()].backlog.store(backlog, std::memory_order_relaxed);
    _backlogs[this_shard_id()].need_publishing = need_publishing::yes;
}
```
For this implementation of calculating the node's view update backlog to work,
we need the atomics to be updated correctly when the semaphores of corresponding
shards change.

The main event where the view update backlog changes is an incoming write
request. That's why when handling the request and preparing a response
we update the backlog calling `storage_proxy::get_view_update_backlog` (also
because we want to read the backlog and send it in the response):
backlog update after local view updates (`storage_proxy::send_to_live_endpoints` in `mutate_begin`)
```
 auto lmutate = [handler_ptr, response_id, this, my_address, timeout] () mutable {
     return handler_ptr->apply_locally(timeout, handler_ptr->get_trace_state())
             .then([response_id, this, my_address, h = std::move(handler_ptr), p = shared_from_this()] {
         // make mutation alive until it is processed locally, otherwise it
         // may disappear if write timeouts before this future is ready
         got_response(response_id, my_address, get_view_update_backlog());
     });
 };
backlog update after remote view updates (storage_proxy::remote::handle_write)

 auto f = co_await coroutine::as_future(send_mutation_done(netw::messaging_service::msg_addr{reply_to, shard}, trace_state_ptr,
         shard, response_id, p->get_view_update_backlog()));
```
Now assume that on a certain node we have a write request received on shard A,
which updates a row on shard B (A!=B). As a result, shard B will generate view
updates and consume units from its `_view_update_concurrency_sem`, but will
not update its atomic in `_backlogs` yet. Because both shards in the example
are on the same node, shard A will perform a local write calling `lmutate` shown
above. In the `lmutate` call, the `apply_locally` will initiate the actual write on
shard B and the `storage_proxy::update_view_update_backlog` will be called back
on shard A. In no place will the backlog atomic on shard B get updated even
though it increased in size due to the view updates generated there.
Currently, what we calculate there doesn't really matter - it's only used for the
MV flow control delays, so currently, in this scenario, we may only overload
a replica causing failed replica writes which will be later retried as hints. However,
when we add MV admission control, the calculated backlog will be the difference
between an accepted and a rejected request.

Fixes: https://github.com/scylladb/scylladb/issues/18542

Without admission control (https://github.com/scylladb/scylladb/pull/18334), this patch doesn't affect much, so I'm marking it as backport/none

Closes scylladb/scylladb#19341

* github.com:scylladb/scylladb:
  test: add test for view backlog not being updated on correct shard
  test: move auxiliary methods for waiting until a view is built to util
  mv: update view update backlog when it increases on correct shard
2024-07-04 11:40:09 +03:00
Kefu Chai
cccec07581 db: use format_as() in favor of fmt::streamed()
since fedora 38 is EOL. and fedora 39 comes with fmt v10.0.0, also,
we've switched to the build image based on fedora 40, which ships
fmt-devel v10.2.1, there is no need to use fmt::streamed() when
the corresponding format_as() as available.

simpler this way.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19594
2024-07-04 11:10:43 +03:00
Kefu Chai
35e7a0b36f test/cql-pytest: use offset-aware API to avoid deprecate warning
to avoid warning like

```
DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.fromtimestamp(timestamp, datetime.UTC).
```

and to be future-proof, let's use the offset-aware timestamp.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19536
2024-07-04 10:48:00 +03:00
Kefu Chai
03e1fce7aa zstd: include external header with brackets
zstd.h is a header provided by libzstd, so let's include it with
brackets, more consistent this way.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19538
2024-07-04 10:42:29 +03:00
Takuya ASADA
09e22690dc scylla_coredump_setup: enable compress by default when zstd support detected
We disabled coredump compression by default because it was too slow,
but recent versions of systemd-coredump supports faster zstd based compression,
so let's enable compression by default when zstd support detected.

Related scylladb/scylla-machine-image#462

Closes scylladb/scylladb#18854
2024-07-04 10:38:22 +03:00
Botond Dénes
e3e5f8209d Merge 'alternator: fix "/localnodes" to use broadcast_rpc_address' from Nadav Har'El
This short series fixes Alternator's "/localnodes" request to allow a node's external IP address - configured with `broadcast_rpc_address` - to be listed instead of its usual, internal, IP address.

The first patch fixes a bug in gossiper::get_rpc_address(), which the second patch needs to implement the feature. The second patch also contains regression tests.

Fixes #18711.

Closes scylladb/scylladb#18828

* github.com:scylladb/scylladb:
  alternator: fix "/localnodes" to use broadcast_rpc_address
  gossiper: fix get_rpc_address() for this node
2024-07-04 10:37:28 +03:00
Takuya ASADA
65fbf72ed0 scylla-housekeeping: fix exception on parsing version string
Since Python 3.12, version parsing becomes strict, parse_version() does
not accept the version string like '6.1.0~dev'.
To fix this, we need to replace version string from '6.1.0~dev' to
'6.1.0.dev0', which is allowed on Python version scheme.

reference: https://packaging.python.org/en/latest/specifications/version-specifiers/

Fixes #19564

Closes scylladb/scylladb#19572
2024-07-04 10:27:51 +03:00
Avi Kivity
69450780a7 docs: explain tuning for a node that is overcommitted at the hypervisor level
Closes scylladb/scylladb#19589
2024-07-04 10:23:25 +03:00
Pavel Emelyanov
8809b99736 s3/client: Unmark put-object lambdas from mutable
They don't need to modify the captured objects. In fact, they must not
do it in the first place, because the request can be called more than
once and the buffers must not change between those invocations.

For the memory_sink_buffers there must be const method to get the vector
of temporary_buffers themselves.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#19599
2024-07-04 10:07:48 +03:00
Lakshmi Narayanan Sreethar
c80df8504c sstables::maybe_rebuild_filter_from_index: log sstable origin
Log the sstable origin when its bloom filter is being rebuilt. The
origin has to be passed to the method by the caller as it is not
available in the sstable object when the filter is rebuilt.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#19601
2024-07-04 10:01:23 +03:00
Wojciech Mitros
1fdc65279d test: add test for view backlog not being updated on correct shard
This patch adds a test for reproducing issue https://github.com/scylladb/scylladb/issues/18542
The test performs writes on a table with a materialized view and
checks that the view backlog increases. To get the current view
update backlog, a new metric "view_update_backlog" is added to
the `storage_proxy` metrics. The metric differs from the metric
from `database` metric with the same name by taking the backlog
from the max_view_update_backlog which keeps view update backlogs
from all shards which may be a bit outdated, instead of taking
the backlog by checking the view_update_semaphore which the backlog
is based on directly.
2024-07-03 23:18:52 +02:00
Wojciech Mitros
c4f5659c11 test: move auxiliary methods for waiting until a view is built to util
In many materialized view tests we need to wait until a view is built before
actually working on it, future tests will also need it. In existing tests
we use the same, duplicated method for achieving that.
In this patch the method is deduplicated and moved to pylib/util.py
and existing tests are modified to use it instead.
2024-07-03 23:18:52 +02:00
Wojciech Mitros
fd9c7d4d59 mv: update view update backlog when it increases on correct shard
When performing a write, we should update the view update backlog
on the shard where the mutation is actually applied. Instead,
currently we only update it on the shard that initially received
the write request (which didn't change at all) and as a result,
the backlog on the correct shard and the aggregated max view update
backlog are not updated at all.
This patch enables updating the backlog on the correct shard. The
update is now performed just after the view generation and propagation
finishes, so that all backlog increases are noted and the backlog is
ready to be used in the write response.
Additionally, after this patch, we no longer (falsely) assume that
the backlog is modified on the same shard as where we later read it
to attach to a response. However, we still compare the aggregated
backlog from all shards and the backlog from the shard retrieving
the max, as with a shard-aware driver, it's likely the exact shard
whose backlog changed.
2024-07-03 23:18:52 +02:00
Avi Kivity
3fc4e23a36 forward_service: rename to mapreduce_service
forward_service is nondescriptive and misnamed, as it does more than
forward requests. It's a classic map/reduce algorithm (and in fact one
of its parameters is "reducer"), so name it accordingly.

The name "forward" leaked into the wire protocol for the messaging
service RPC isolation cookie, so it's kept there. It's also maintained
in the name of the logger (for "nodetool setlogginglevel") for
compatibility with tests.

Closes scylladb/scylladb#19444
2024-07-03 19:29:47 +03:00
Avi Kivity
f798217293 Merge 'build: cmake: include the whole archive of zstd.a' from Kefu Chai
before this change, when linking scylla-main, the linker discards
the unreferenced symbols defined by zstd.cc. but we use constructor
of static variable `registerator` to register the zstd compressor,
this variable is not used from the linker's point of view. but we
do rely on the side effect of its constructor.
that's why the rules generated by CMake fails to build tests and
scylla executables with zstd support. that's why we have following
test failure:
```
boost.sstable_3_x_test.test_uncompressed_collections_read
...
[Exception] - no_such_class: unable to find class 'org.apache.cassandra.io.compress.ZstdCompressor'
 == [File] - seastar/src/testing/seastar_test.cc
 == [Line] - 43
```

in this change, we single out zstd.cc and build it as an archive,
so that scylla-main can include as a whole. an alternative is to
link scylla-main as a whole archive, but that might increase the disk
foot print when building lots of tests -- some of them do not use all
symbols exposed by scylla-main, and can potentially have smaller
size if linker can discard the unused symbols.

Refs https://github.com/scylladb/scylladb/issues/2717

---

cmake related change, hence no need to backport.

Closes scylladb/scylladb#19539

* github.com:scylladb/scylladb:
  build: cmake: include the whole archive of zstd.a
  build: cmake: find libzstd before using it
2024-07-03 17:38:22 +03:00
Botond Dénes
fca0a58674 Merge 'Close output_stream in get_compaction_history() API handler' from Pavel Emelyanov
If an httpd body writer is called with output_stream<>, it mist close the stream on its own regardless of any exceptions it may generate while working, otherwise stream destructor may step on non-closed assertion. Stepped on with different handler, see #19541

Coroutinize the handler as the first step while at it (though the fix would have been notably shorter if done with .finally() lambda)

Closes scylladb/scylladb#19543

* github.com:scylladb/scylladb:
  api: Close response stream of get_compaction_history()
  api: Flush output stream in get_compaction_history() call
  api: Coroutinize get_compaction_history inner function
2024-07-03 17:00:26 +03:00
Kefu Chai
fd5c04acbb .github: use the latest dbuild image
scylla does not build using scylla-toolchain:fedora-38-20240521, like:

```
FAILED: repair/CMakeFiles/repair.dir/repair.cc.o
/usr/bin/clang++ -DBOOST_NO_CXX98_FUNCTION_BASE -DDEVEL -DFMT_SHARED -DSCYLLA_BUILD_MODE=dev -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/__w/scylladb/scylladb -I/__w/scylladb/scylladb/build/gen -I/__w/scylladb/scylladb/seastar/include -I/__w/scylladb/scylladb/build/seastar/gen/include -I/__w/scylladb/scylladb/build/seastar/gen/src -isystem /__w/scylladb/scylladb/abseil -O2 -std=gnu++2b -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/__w/scylladb/scylladb=. -march=westmere -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -MD -MT repair/CMakeFiles/repair.dir/repair.cc.o -MF repair/CMakeFiles/repair.dir/repair.cc.o.d -o repair/CMakeFiles/repair.dir/repair.cc.o -c /__w/scylladb/scylladb/repair/repair.cc
In file included from /__w/scylladb/scylladb/repair/repair.cc:10:
In file included from /__w/scylladb/scylladb/repair/row_level.hh:14:
In file included from /__w/scylladb/scylladb/repair/task_manager_module.hh:14:
In file included from /__w/scylladb/scylladb/tasks/task_manager.hh:20:
In file included from /__w/scylladb/scylladb/seastar/include/seastar/coroutine/parallel_for_each.hh:24:
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/ranges:6161:14: error: requires clause differs in template redeclaration
    requires forward_range<_Vp>
             ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/ranges:5860:14: note: previous template declaration is here
    requires input_range<_Vp>
             ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19547
2024-07-03 16:57:22 +03:00
Kefu Chai
a88496318b alternator: use std::to_underlying() when appropriate
now that we can use C++23 features, there is no need to hardcode the
underlying type anymore.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19546
2024-07-02 18:51:29 +03:00
Kefu Chai
57def6f1e2 docs: install in non-package node
when running `make setup`, we could have following failure:
```
Installing the current project: scylla (4.3.0)
The current project could not be installed: No file/folder found for package scylla
If you do not want to install the current project use --no-root
```

because docs is not a proper python project named "scylla",
and do not have a directory structure expected by poetry. what we
expect from poetry, is to manage the dependencies for building
the document.

so, in this change, we install in the `non-package` mode when running
`poetry install`, this skips the root package, which does not exist.
as an alternative, we could put an empty `scylla.py` under `docs`
directory, but that'd be overkill. or we could pass `--no-root`
to `poetry install`, but would be ideal if we can keep the settings
in a single place.

see also https://python-poetry.org/docs/basic-usage/#operating-modes,
and https://python-poetry.org/docs/cli/#options-2, for more
details on the settings and command line options of poetry.

please note this setting was added to poetry 1.8, so the required
poetry version is updated. we might need to upgrade poetry in existing
installation.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19498
2024-07-02 18:03:20 +03:00
Michael Litvak
08b29460fc mv: skip building view updates on a pending replica
Currently, a pending replica that applies a write on a table that has
materialized views, will build all the view updates as a normal replica,
only to realize at a late point, in db::view::get_view_natural_endpoint(),
that it doesn't have a paired view replica to send the updates to. It will
then either drop the view updates, or send them to a pending view
replica, if such exists.

This work is unnecessary since it may be dropped, and even if there is a
pending view replica to send the updates to, the updates that are built
by the pending replica may be wrong since it may have incomplete
information.

This commit fixes the inefficiency by skipping the view update building
step when applying an update on a pending replica.

The metric total_view_updates_on_wrong_node is added to count the cases
that a view update is determined to be unnecessary.

The test reproduces the scenario of writing to a table and applying
the update on a pending replica, and verifies that the pending replica
doesn't try to build view updates.

Fixes scylladb/scylladb#19152

Closes scylladb/scylladb#19488
2024-07-02 13:10:18 +02:00
Nadav Har'El
d61513c41c Merge 'reader_concurrency_semaphore: make CPU concurrency configurable' from Botond Dénes
The reader concurrency semaphore restricts the concurrency of reads that require CPU (intention: they read from the cache) to 1, meaning that if there is even a single active read which declares that it needs just CPU to proceed, no new read is admitted. This is meant to keep the concurrency of reads in the cache at 1. The idea is that concurrency in the cache is not useful: it just leads to the reactor rotating between these reads, all of the finishing later then they could if they were the only active read in the cache.
This was observed to backfire in the case where there reads from a single table are mostly very fast, but on some keys are very slow (hint: collection full of tombstones). In this case the slow read keeps up the fast reads in the queue, increasing the 99th percentile latencies significantly.

This series proposes to fix this, by making the CPU concurrency configurable. We don't like tunables like this and this is not a proper fix, but a workaround. The proper fix would be to allow to cut any page early, but we cannot cut a page in the middle of a row. We could maybe have a way of detecting slow reads and excluding them from the CPU concurrency. This would be a heuristic and it would be hard to get right. So in this series a robust and simple configurable is offered, which can be used on those few clusters which do suffer from the too strict concurrency limit. We have seen it in very few cases so far, so this doesn't seem to be wide-spread.

Fixes: https://github.com/scylladb/scylladb/issues/19017

This fixes a regression introduced in 5.0, so we have to backport to all currently supported releases

Closes scylladb/scylladb#19018

* github.com:scylladb/scylladb:
  test/boost/reader_concurrency_semaphore_test: add test for live-configurable cpu concurrenc  Please enter the commit message for your changes. Lines starting
  test/boost/reader_concurrency_semaphore_test: hoist require_can_admit
  reader_concurrency_semaphore: wire in the configurable cpu concurrency
  reader_concurrency_semaphore: add cpu_concurrency constructor parameter
  db/config: introduce reader_concurrency_semahore_cpu_concurrency
2024-07-02 13:39:00 +03:00
Tzach Livyatan
6ea475ec76 Docs: Fix a typo in sstable-corruption.rst
Closes scylladb/scylladb#19515
2024-07-02 11:58:27 +02:00
Kamil Braun
bcfdeda080 Merge 'co-routinize paxos_state functions' from Gleb
Co-routinize paxos_state functions to make them more readable.

* 'gleb/coroutineze-paxos-state' of github.com:scylladb/scylla-dev:
  paxos: simplify paxos_state::prepare code to not work with raw futures
  paxos: co-routinize paxos_state::learn function
  paxos: remove no longer used with_locked_key functions
  paxos: co-routinize paxos_state::accept function
  paxos: co-routinize paxos_state::prepare function
  paxos: introduce get_replica_lock() function to take RAII guard for local paxos table access
2024-07-02 11:54:13 +02:00
Tzach Livyatan
4938927fc2 Docs: fix typo in config-commands.rst
This is a leftover from https://github.com/scylladb/scylladb/pull/19578,
which mistakenly update the "scylla" script name to "ScyllaDB"

Closes scylladb/scylladb#19583
2024-07-02 10:54:47 +02:00
Kamil Braun
edeb266fc2 Merge 'docs, config: render logging related options' from Kefu Chai
this changeset adds a filter to customize the rendering of default
values, and enables the `scylladb_cc_properties` extension to display
the logging message related options. it prepares for the further
improvements in
https://opensource.docs.scylladb.com/master/reference/configuration-parameters.html.

this changeset also prepare for the improvements requested by #19463

---

it's an improvement in the document, hence no need to backport.

Closes scylladb/scylladb#19483

* github.com:scylladb/scylladb:
  config: add descriptions for default_log_level and friends
  config: define log_to_syslog in a different line
  docs: parse log_legacy_value as declarations of config option
2024-07-02 10:44:50 +02:00
Tzach Livyatan
91401f7da5 docs: Update Scylla to ScyllaDB in *all* RST docs files v3
Closes scylladb/scylladb#19578
2024-07-01 18:04:21 +02:00
Andrei Chekun
b6aabca9a7 Add documentation how to use allure reporting
Add documentation how to install and basic usage example of the allure reporting tool.
Fix typo test/README.md

Related: scylladb/qa-tasks#1665

Depends on: scylladb/scylladb#18169

Closes scylladb/scylladb#18710
2024-07-01 16:21:50 +02:00
Gleb Natapov
9ebdb23002 raft: add more raft metrics to make debug easier 2024-07-01 10:55:22 +02:00
Kamil Braun
94bc9d4f5b Merge 'Do not expire local addres in raft address map since the local node cannot disappear' from Gleb Natapov
A node may wait in the topology coordinator queue for awhile before been
joined. Since the local address is added as expiring entry to the raft
address map it may expire meanwhile and the bootstrap will fail. The
series makes the entry non expiring.

Fixes  scylladb/scylladb#19523

Needs to be backported to 6.0 since the bug may cause bootstrap to fail.

Closes scylladb/scylladb#19557

* github.com:scylladb/scylladb:
  test: add test that checks that local address cannot expire between join request placemen and its processing
  storage_service: make node's entry non expiring in raft address map
2024-07-01 09:12:48 +02:00
Kefu Chai
90be71d959 build: cmake: include the whole archive of zstd.a
before this change, when linking scylla-main, the linker discards
the unreferenced symbols defined by zstd.cc. but we use constructor
of static variable `registerator` to register the zstd compressor,
this variable is not used from the linker's point of view. but we
do rely on the side effect of its constructor.
that's why the rules generated by CMake fails to build tests and
scylla executables with zstd support. that's why we have following
test failure:
```
boost.sstable_3_x_test.test_uncompressed_collections_read
...
[Exception] - no_such_class: unable to find class 'org.apache.cassandra.io.compress.ZstdCompressor'
 == [File] - seastar/src/testing/seastar_test.cc
 == [Line] - 43
```

in this change, we single out zstd.cc and build it as an archive,
so that scylla-main can include as a whole. an alternative is to
link scylla-main as a whole archive, but that might increase the disk
foot print when building lots of tests -- some of them do not use all
symbols exposed by scylla-main, and can potentially have smaller
size if linker can discard the unused symbols.

Refs #2717
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-07-01 11:51:19 +08:00
Kefu Chai
1e0af0fb7e build: cmake: find libzstd before using it
we use libzstd in zstd.cc. so let's find this library before using
it. this helps user to identify problem when preparing the building
environment, instead of being greeted by a compile-time failure.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-07-01 11:51:19 +08:00
Kefu Chai
b71b638b2e config: add descriptions for default_log_level and friends
so that their description can be displayed in
`reference/configuration-parameters/` web page.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-07-01 09:47:28 +08:00
Kefu Chai
b486f4ef01 config: define log_to_syslog in a different line
before this change, docs/_ext/scylladb_cc_properties.py parses
the options line by line, because `log_to_stdout` and `log_to_syslog`
are defined in a single line, this script is not able to parse them,
hence fails to display them on the `reference/configuration-parameters/`
web page.

after this change, these two member variables are defined on different
lines. both of them can be displayed.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-07-01 09:47:28 +08:00
Kefu Chai
34cab80103 docs: parse log_legacy_value as declarations of config option
before this change, we only consider "named_value<type>" as the
declaration of option, and the "Type" field of the corresponding
option is displayed if its declaration is found. otherwise, "Type"
field is not rendered. but some logging related options are declared
using `log_legacy_value`, so they are missing.

after this change, they are displayed as well.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-07-01 09:47:28 +08:00
Kefu Chai
405f624776 cql3: define dtor of modification_statement in .cc file
before this change, we rely on the compiler to use the
definition of `cql3::attributes` to generate the defaulted
destructor in .cc file. but with clang-19, it insists that
we should have a complete definition available for defining
the defaulted destructor, otherwise it fails the build:

```
/home/kefu/.local/bin/clang++ -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT CMakeFiles/scylla-main.dir/RelWithDebInfo/table_helper.cc.o -MF CMakeFiles/scylla-main.dir/RelWithDebInfo/table_helper.cc.o.d -o CMakeFiles/scylla-main.dir/RelWithDebInfo/table_helper.cc.o -c /home/kefu/dev/scylladb/table_helper.cc
In file included from /home/kefu/dev/scylladb/table_helper.cc:10:
In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/coroutine.hh:25:
In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/future.hh:30:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/memory:78:
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:91:16: error: invalid application of 'sizeof' to an incomplete type 'cql3::attributes'
   91 |         static_assert(sizeof(_Tp)>0,
      |                       ^~~~~~~~~~~
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:398:4: note: in instantiation of member function 'std::default_delete<cql3::attributes>::operator()' requested here
  398 |           get_deleter()(std::move(__ptr));
      |           ^
/home/kefu/dev/scylladb/cql3/statements/modification_statement.hh:40:7: note: in instantiation of member function 'std::unique_ptr<cql3::attributes>::~unique_ptr' requested here
   40 | class modification_statement : public cql_statement_opt_metadata {
      |       ^
/home/kefu/dev/scylladb/cql3/statements/modification_statement.hh:40:7: note: in implicit destructor for 'cql3::statements::modification_statement' first required here
/home/kefu/dev/scylladb/cql3/statements/modification_statement.hh:28:7: note: forward declaration of 'cql3::attributes'
   28 | class attributes;
      |       ^
```

so, in this change, we define the destructor in .cc file, where
the complete definition of `cql3::attributes` is available.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19545
2024-06-30 19:35:05 +03:00
Avi Kivity
0ce00ebfbd Merge 'Close output stream in task manager's API get_tasks handler' from Pavel Emelyanov
If client stops reading response early, the server-side stream throws but must be closed anyway. Seen in another endpoint and fixed by #19541

Closes scylladb/scylladb#19542

* github.com:scylladb/scylladb:
  api: Fix indentation after previous patch
  api: Close response stream on error
  api: Flush response output stream before closing
2024-06-30 19:34:00 +03:00
Avi Kivity
3a85d88b68 Merge 'Close output_stream in get_snapshot_details() API handler' from Pavel Emelyanov
All streams used by httpd handlers are to be closed by the handler itself, caller doesn't take care of that.

fixes: #19494

Closes scylladb/scylladb#19541

* github.com:scylladb/scylladb:
  api: Fix indentation after previous patch
  api: Close output_stream on error
  api: Flush response output stream before closing
2024-06-30 19:33:16 +03:00
Avi Kivity
2fbc532e4d Update tools/python3 submodule
* tools/python3 3e833f1...18fa79e (1):
  > reloc: use `--add-rpath` and not `--set-rpath`
2024-06-30 19:31:23 +03:00
Kefu Chai
77d2d5821d build: cmake: do not mark cqlsh noarch
in 3c7af287, cqlsh's reloc package was marked as "noarch", and its
filename was updated accordingly in `configure.py`, so let's update
the CMake building system accordingly.

this change should address the build failure of

```
08:48:14  [3325/4124] Generating ../Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz
08:48:14  FAILED: Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz /jenkins/workspace/scylla-master/scylla-ci/scylla/build/Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz
08:48:14  cd /jenkins/workspace/scylla-master/scylla-ci/scylla/build/dist && /usr/bin/cmake -E copy /jenkins/workspace/scylla-master/scylla-ci/scylla/tools/cqlsh/build/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz /jenkins/workspace/scylla-master/scylla-ci/scylla/build/Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz
08:48:14  Error copying file "/jenkins/workspace/scylla-master/scylla-ci/scylla/tools/cqlsh/build/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz" to "/jenkins/workspace/scylla-master/scylla-ci/scylla/build/Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz".
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19544
2024-06-30 19:26:54 +03:00
Nadav Har'El
44e036c53c alternator: fix "/localnodes" to use broadcast_rpc_address
Alternator's non-standard "/localnodes" HTTP request returns a list of
live nodes on this DC, to consider for load balancing. The returned
node addresses should be external IP addresses usable by the clients.
Scylla has a configuration parameter - broadcast_rpc_address - which
defines for a node an external IP address. If such a configuration
exists, we need to use those external IP addresses, not the internal
ones.

Finding these broadcast_rpc_address of all nodes is easy, because the
gossiper already gossips them.

This patch also tests the new feature:
1. The existing single-node test is extended to verify that without
   broadcast_rpc_address we get the usual IP address.
2. A new two-node test is added to check that when broadcast_rpc_address
   is configured, we get that address and not the usual internal IP
   addresses.

Fixes #18711.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-06-30 18:38:15 +03:00
Nadav Har'El
2a2e8167c8 gossiper: fix get_rpc_address() for this node
Commit dd46a92e23 introduced a function gossiper::get_rpc_address()
as a shortcut for get_application_state_ptr(endpoint, RPC_ADDRESS) -
i.e., it fetches the endpoint's configured broadcast_rpc_address
(despite its confusing name, this is the endpoint's external IP address
that clients can use to make CQL connections).

But strangely, the implementation get_rpc_address() made an exception
for asking about the *current* host - where instead of getting this
node's broadcast_rpc_address, it returns its internal address, which
is not what this function was supposed to do - it's not useful for
it to do one thing for this node, and a different thing for other
nodes, and when I wrote code that uses this function (see the next
patch), this resulted in wrong results for the current node.

The fix is simple - drop the wrong if(), and get the
broadcast_rpc_address stored by the gossiper unconditionally - the
gossiper knows it for this node just like for other nodes.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-06-30 18:38:15 +03:00
Gleb Natapov
3f136cf2eb test: add test that checks that local address cannot expire between join request placemen and its processing 2024-06-30 15:52:23 +03:00
Gleb Natapov
5d8f08c0d7 storage_service: make node's entry non expiring in raft address map
Local address map entry should never expire in the address map.
2024-06-30 15:08:50 +03:00
Kefu Chai
947e28146d dbuild: pass --tty when running in interactive mode
podman does not allocate a tty by default, so without `-t` or `--tty`,
one cannot use a functional terminal when interacting with the
container. that what one can expect when running `dbuild -i --`, and
we are greeted with :

```
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
```

after this change, one can enjoy the good-old terminal as usual
after being dropped to the container provided by `dbuild -i --`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19550
2024-06-30 12:06:55 +03:00
Pavel Emelyanov
d034cde01f Merge 'build: update C++ standard to C++23' from Avi Kivity
Switch the C++ standard from C++20 to C++23. This is straightforward, but there are a few
fallouts (mostly due to std::unique_ptr that became constexpr) that need to be fixed first.

Internal enhancement - no backport required

Closes scylladb/scylladb#19528

* github.com:scylladb/scylladb:
  build: switch to C++23
  config: avoid binding an lvalue reference to an rvalue reference
  readers: define query::partition_slice before using it in default argument
  test: define table_for_tests earlier
  compaction: define compaction_group::table_state earlier
  compaction: compaction_group: define destructor out-of-line
  compaction_manager: define compaction_manager::strategy_control earlier
2024-06-28 18:02:33 +03:00
Avi Kivity
cf66f233aa build: remove aarch64 workarounds
In 90a6c3bd7a ("build: reduce release mode inline tuning on aarch64") we
reduced inlining on aarch64, due to miscompiles.

In 224a2877b9 ("build: disable -Og in debug mode to avoid coroutine
asan breakage") we disabled optimization in debug mode, due to miscompiles.

With clang 18.1, it appears the miscompiles are gone, and we can remove
the two workarounds.

Closes scylladb/scylladb#19531
2024-06-28 17:53:51 +03:00
Pavel Emelyanov
b4f9387a9d api: Close response stream of get_compaction_history()
The function must close the stream even if it throws along the way.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-28 16:56:53 +03:00
Pavel Emelyanov
6d4ba98796 api: Flush output stream in get_compaction_history() call
It's currently implicitly flushed on its close, but in that case close
can throw while flusing. Next patch wants close not to throw and that's
possible if flushing the stream in advance.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-28 16:55:58 +03:00
Pavel Emelyanov
acb351f4ee api: Coroutinize get_compaction_history inner function
The handler returns a function which is then invoked with output_stream
argument to render the json into. This function is converted into
coroutine. It has yet another inner lambda that's passed into
compaction_manager::get_compaction_history() as consumer lambda. It's
coroutinized too.

The indentation looks weird as preparation for future patching.
Hopefullly it's still possible to understand what's going on.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-28 16:53:46 +03:00
Pavel Emelyanov
1be8b2fd25 api: Fix indentation after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-28 16:07:21 +03:00