The code tries to build as "neighbors" an unordered_map from an iterator
of std::tuple, instead of the correct std::pair. Apparently, the tuples
are transparently converted to pairs on the newest compilers and the
whole works, but on slightly older compilers (like the one on Fedora 39)
Scylla no longer compiles - the compiler complains it can't convert a
tuple to a pair in this context.
So fix the code to use pairs, not tuples, and it fixes the build on
Fedora 39.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#20319
in `configure.py`, a set of options are specified when configuring
seastar, but not all of them were ported to scylla's CMake building
system.
for instance, `configure.py` explicitly disables io_uring reactor
backend at build time, but the CMake-based system does not.
so, in this change, in order to preserve the existing behavior, let's
port the two previously missing option to CMake-based building system
as well.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20288
Attempting to read a partition via `SELECT * FROM MUTATION_FRAGMENTS()`, which the node doesn't own, from a table using tablets causes a crash.
This is because when using tablets, the replica side simply doesn't handle requests for un-owned tokens and this triggers a crash.
We should probably improve how this is handled (an exception is better than a crash), but this is outside the scope of this PR.
This PR fixes this and also adds a reproducer test.
Fixes: https://github.com/scylladb/scylladb/issues/18786
Fixes a regression introduced in 6.0, so needs backport to 6.0 and 6.1
Closesscylladb/scylladb#20109
* github.com:scylladb/scylladb:
test/tablets: Test that reading tablets' mutations from MUTATION_FRAGMENTS works
replica/mutation_dump: enfore pinning of effective replication map
replica/mutation_dump: handle un-owned tokens (with tablets)
When executing reversed queries, a native revered format shall be used. Therefore, the table schema and the clustering key bounds are reversed before a partition slice and a read command are constructed.
It is, however, possible to run a reversed query passing a table schema but only when there are no restrictions on the clustering keys. In this particular situation, the query returns correct results. Since the current alternator tests in test.py do not imply any restrictions, this situation was not caught during development of https://github.com/scylladb/scylladb/pull/18864.
Hence, additional tests are provided that add clustering keys restrictions when executing reversed queries to capture such errors earlier than in dtests.
Additional manual tests were performed to test a mixed-node cluster (with alternator API enabled in Scylla on each node):
1. 2-node cluster with one node upgraded: reverse read queries performed on an old node
2. 2-node cluster with one node upgraded: reverse read queries performed on a new node
3. 2-node cluster with one node upgraded and all its sstable files deleted to trigger repair: reverse read queries performed on an old node
4. 2-node cluster with one node upgraded and all its sstable files deleted to trigger repair: reverse read queries performed on a new node
All reverse read queries above consists of:
- single-partition reverse reads with no clustering key restrictions, with single column restrictions and multi column restrictions both with and without paging turned on
The exact same tests were also performed on a fully upgraded cluster.
Fixes https://github.com/scylladb/scylladb/issues/20191
No backport is required as this is a complementary patch for the series https://github.com/scylladb/scylladb/pull/18864 that did not require backporting.
Closesscylladb/scylladb#20205
* github.com:scylladb/scylladb:
test_query.py: Test reverse queries with clustering key bounds
alternator::do_query Add additional trace log
alternator::do_query: Use native reversed format
alternator::do_query Rename schema with table_schema
Currently, the `force` property of the `source_dc` rebuild option
is lost and `raft_topology_cmd_handler` has no way to know
if it was given or not.
This in turn can cause rebuild to fail, even when `--force`
is set by the user, where it would succeed with gossip
topology changes, based on the source_dc --force semantics.
Fixesscylladb/scylladb#20242
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#20249
* seastar a7d81328...83e6cdfd (29):
> fair_queue: Export the number of times class was activated
> tests/unit: drop support of C++17
> remove vestigial OSv support
> cmake: undefine _FORTIFY_SOURCE on thread.cc
> container_perf: a benchmark for container perf
> io_sink: use chunked_fifo as _pending_io container
> chunked_fifo: implement clear in terms of pop_n
> chunked_fifo: pop_front_n
> io_sink: use iteration instead of indexing
> json2code_test: choose less popular port number
> ioinfo: add '--max-reqsize' parameter
> treewide: drop the support of fmtlib < 8.0.0
> build: bump up the required fmtlib version to 8.1.1
> conditional-variable: align when() and wait() behaviour in case of a predicate throwing an exception
> stall-analyser: add output support for flamegraph
> reactor: Add --io-completion-notify-ms option
> io_queue: Stall detector
> io_queue: Keep local variable with request execution delay
> io_queue: Rename flow ratio timer to be more generic
> reactor: Export _polls counter (internally)
> dns: de-inline dns_resolver::impl methods
> dns: enter seastar::net namespace
> dnf: drop compatibility for c-ares <= 1.16
> reactor: add missing includes of noncopyable_function.hh
> reactor: Reset one-shot signal to DFL before handling
> future: correctly document nested exception type emitted by finally()
> modules: fix FATAL_ERROR on compiler check
> seastar.cc: include fmt/ranges.h
> pack io_request
Closesscylladb/scylladb#20300
`dist-check` tests the generated rpm packages by installing them in a centos 7 container. but this script is terribly outdated
- centos 7 is deprecated. we should use a new distro's latest stable release.
- cqlsh was added to the family of rpms a while ago. we should test it as well.
- the directory hierarchy has been changed. we should read the artifacts from the new directories.
- cmake uses a different directory hierarchy. we should check the directory used by cmake as well.
to address these breaking changes, the scripts are updated accordingly.
---
this change gives an overhaul to a test, which is not used in production. so no need to backport.
Closesscylladb/scylladb#20267
* github.com:scylladb/scylladb:
tools/testing: add cqlsh rpm
tools/testing: adapt to cmake build directory
tools/testing: test with rockylinux:9 not centos:7
tools/testing: correct the paths to rpm packages and SCYLLA-*-FILE
dist-check: add :z option when mapping volume
Currently, when a view update backlog of one replica is full, the write is still sent by the coordinator to all replicas. Because of the backlog, the write fails on the replica, causing inconsistency that needs to be fixed by repair. To avoid these inconsistencies, this patch adds a check on the coordinator for overloaded replicas. As a result, a write may be rejected before being sent to any replicas and later retried by the user, when the replica is no longer overloaded.
This patch does not remove the replica write failures, because we still may reach a full backlog when more view updates are generated after the coordinator check is performed and before the write reaches the replica.
Fixesscylladb/scylladb#17426Closesscylladb/scylladb#18334
* github.com:scylladb/scylladb:
mv: test the view update behavior
mv: add test for admission control
storage_proxy: return overloaded_exception instead of throwing
mv: reject user requests by coordinator when a replica is overloaded by MVs
This series fixes an issue where histogram Summaries return an infinite value.
It updated the quantile calculation logic to address cases where values fall into the infinite bucket of a histogram.
Now, instead of returning infinite (max int), the calculation will return the last bucket limit, ensuring finite outputs in all cases.
The series adds a test for summaries with a specific test case for this scenario.
Fixes#20255
Need backport to 6.0, 6.1 and 2023.1 and above
Closesscylladb/scylladb#20257
* github.com:scylladb/scylladb:
test/estimated_histogram_test Add summary tests
utils/histogram.hh: Make summary support inifinite bucket.
instead of using the default `argparse.HelpFormatter`, let's
use `ArgumentDefaultsHelpFormatter`, so that the default values
of options are displayed in the help messages.
this should help developer understand the behavior of the script
better.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20262
to achieve feature parity with our existing building system, we
need to implement a new build target "dist-check" in the CMake-based
building system.
in this change, "dist-check" is added to CMake-based building system.
unlike the rules generated by `configure.py`, the `dist-check` target
in CMake depends on the dist-*-rpm targets. the goal is to enable user
to test `dist-check` without explicitly building the artifacts being
tested.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20266
in 57def6f1, we specified "package-mode" for poetry, but this option
was introduced in poetry 1.8.0, as the "non-package" mode support.
see https://github.com/python-poetry/poetry/releases/tag/1.8.0
this change practically bumps up the minimum required poetry version
to 1.8.0, we did update `pyproject.tombl` to reflect this change.
but wefailed to update the `Makefile`.
in this change, we update `Makefile` to ensure that user which happens
have an older version of poetry can install the version which supports
this version when running `make setupenv`.
Refs scylladb/scylladb#20284
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20286
we switched from `circular_buffer` to `chunked_fifo` to present
`io_sink::_pending_io` in the latest seastar now. to be prepared for
this change, let's
* add `chunked_fifo` class in `scylla-gdb.py`.
* use `circular_buffer` as a fallback of `chunked_fifo`. instead of
doing this the other way around, we try to send the message that
the latest seastar uses `chunked_fifo`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20280
Add parameter --cluster-pool-size that can control pool size for all
PythonTestSuite tests. By default, the pool size set to 10 for most of
the suites, but this is too much for laptops. So this parameter can be
used to lower the pool size and not to freeze the system. Additionally,
the environment variable CLUSTER_POOL_SIZE was added for a convenient way
to limit pool size in the system without the need to provide each time an
additional parameter.
Related: https://github.com/scylladb/scylladb/pull/20276Closesscylladb/scylladb#20289
To prevent stalls due to large number of tokens.
For example, large cluster with say 70 nodes can have
more than 16K tokens.
Fixes#19757Closesscylladb/scylladb#19758
* github.com:scylladb/scylladb:
abstract_replication_strategy: make get_ranges async
database: get_keyspace_local_ranges: get vnode_effective_replication_map_ptr param
compaction: task_manager_module: open code maybe_get_keyspace_local_ranges
alternator: ttl: token_ranges_owned_by_this_shard: let caller make the ranges_holder
alternator: ttl: can pass const gms::gossiper& to ranges_holder
alternator: ttl: ranges_holder_primary: unconstify _token_ranges member
alternator: ttl: refactor token_ranges_owned_by_this_shard
Removed people that no longer contribute to the scylladb.git and added/substituted reviewers responsible for maintaining the frontend components.
No need to backport, this is just an information for the github tool.
Closesscylladb/scylladb#20136
* github.com:scylladb/scylladb:
codeowners: add appropriate reviewers to the cluster components
codeowners: add appropriate reviewers to the frontend components
codeowners: fix codeowner names
codeowners: remove non contributors
table_helper has some quite awkward code, improve it a little.
Code cleanup, so no reason to backport.
Closesscylladb/scylladb#20194
* github.com:scylladb/scylladb:
table_helper: insert(): improve indentation
table_helper: coroutinize insert()
table_helper: coroutinize cache_table_info()
table_helper: extract try_prepare()
Currently "table removal" is logged as a reason of compaction stop for table drop,
tablet cleanup and tablet split. Modify log to reflect the reason.
Closesscylladb/scylladb#20042
* github.com:scylladb/scylladb:
test: add test to check compaction stop log
compaction: fix compaction group stop reason
we need to test the installation of cqlsh rpm. also, we should use the
correct paths of the generated rpm packages.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
cmake uses a different arrangement, so let's check for the existence
of the build directory and fallback to cmake's build directory.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
the centos image repos on docker has been deprecated, and the
repo for centos7 has been removed from the main CentOS servers.
so we are either not able to install packages from its default repo,
without using the vault mirror, or no longer to pull its image
from dockerhub.
so, in this change
* we switch over to rockylinux:9, which is the latest stable release
of rockylinux, and rockylinux is a popular clone of RHEL, so it
matches our expectation of a typical use case of scylla.
* use dnf to manage the packages. as dnf is the standard way to manage
rpm packages in modern RPM-based distributions.
* do not install deltarpm.
delta rpms are was not supported since RHEL8, and the `deltarpm` package
is not longer available ever since. see
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html-single/considerations_in_adopting_rhel_8/index#ref_the-deltarpm-functionality-is-no-longer-supported_notable-changes-to-the-yum-stack
as a sequence, this package does not exist in Rockylinux-9.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
when building with the rules generated from `configure.py`,
these files are located under tools' own build directory.
so correct them.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
if SELinux is enabled on the host, we'd have following failure when
running `dist-check.sh`:
```
+ podman run -i --rm -v /home/kefu/dev/scylladb:/home/kefu/dev/scylladb docker.io/centos:7 /bin/bash -c 'cd /home/kefu/dev/scylladb && /home/kefu/dev/scylladb/tools/testing/dist-check/docker.io/centos-7.sh --mode debug'
/bin/bash: line 0: cd: /home/kefu/dev/scylladb: Permission denied
```
to address the permission issue, we need to instruct podman to
relabel the shared volume, so that the container can access
the shared volume.
see also https://docs.podman.io/en/stable/markdown/podman-pod-create.1.html#volume-v-source-volume-host-dir-container-dir-options
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, none of the target generated by CMake-based
building system runs `test.py`. but `build.ninja` generated directly
by `configure.py` provides a target named `test`, which runs the
`test.py` with the options passed to `configure.py`.
to be more compatible with the rules generated by `configure.py`,
in this change
* do not include "CTest" module, as we are not using CTest for
driving tests. we use the homebrew `test.py` for this purpose.
more importantly, the target named "test" is provided by "CTest".
so in order to add our own "test" target, we cannot use "CTest"
module.
* add a target named "test" to run "test.py".
* add two CMake options so we can customize the behavior of "test.py",
this is to be compatible with the existing behavior of `configure.py`.
Refs scylladb/scylladb#2717
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20263
This adds minimal implementation of the start-backup API call.
The method starts a task that uploads all files from the given keyspace's snapshot to the requested endpoint/bucket. Arguments are:
- endpoint -- the ID in object_store.yaml config file
- bucket -- the target bucket to put objects into
- keyspace -- the keyspace to work on
- snapshot -- the method assumes that the snapshot had been already taken and only copies sstables from it
The task runs in the background, its task_id is returned from the method once it's spawned and it should be used via /task_manager API to track the task execution and completion (hint: it's good to have non-zero TTL value to make sure fast backups don't finish before the caller manages to call wait_task API).
Sstables components are scanned for all tables in the keyspace and are uploaded into the /bucket/${cf_name}/${snapshot_name}/ path.
refs: #18391Closesscylladb/scylladb#19890
* github.com:scylladb/scylladb:
tools/scylla-nodetool: add backup integration
docs: Document the new backup method
test/object_store: Test that backup task is abortable
test/object_store: Add simple backup test
test/object_store: Move format_tuples()
test/pylib: Add more methods to rest client
backup-task: Make it abortable (almost)
code: Introduce backup API method
database: Export parse_table_directory_name() helper
database: Introduce format_table_directory_name() helper
snapshot-ctl: Add config to snapshot_ctl
snapshot-ctl: Add sstables::storage_manager dependency
snapshot-ctl: Maintain task manager module
snapshot-ctl: Add "snapshots" logger
snapshot-ctl: Outline stop() method and constructor
snapshot-ctl: Inline run_snapshot_list<>
test/cql_test_env: Export task manager from cql test env
task_manager: Print task ttl on start (for debugging)
docs: Update object_storage.md with AWS_ environment
docs: Restructure object_storage.md
since the rules generated by `configure.py` has this target, we need
to have an equivalent target as well in CMake-based buidling system.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20265
Increase pool size changes were recently reverted because of the flakiness for the test_gossip_boot test. Test started
to fail on adding the node to the cluster without any issues in the Scylla log file. In test logs it looked like the
installation process for the new node just hanged. After investigating the problem, I've found out that the issue is that
test.py was draining the io_executor pool for cleaning the directory during install that was set to eight workers. So
to fix the issue, io_executor pool should be increased to more or less the same ratio as it was: doubled cluster pool size.
Closesscylladb/scylladb#20276
so that we can use {fmt} with it without the help of fmt::streamed.
also since we have a proper formatter for replication_strategy_type,
let's implement
`formatter<vnode_effective_replication_map::factory_key>`
as well.
since there are no callers of these two operator<<, let's drop
them in this change.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20248
To prevent stalls due to large number of tokens.
For example, large cluster with say 70 nodes can have
more than 16K tokens.
Fixes#19757
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Prepare for making the function async.
Then, it will need to hold on to the erm while getting
the token_ranges asynchronously.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is used only here and can be simplified by
checking if the keyspace replication strategy
is per table by the caller.
Prepare for making get_keyspace_local_ranges async.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Add static `make` methods to ranges_holder_{primary,secondary}
and use them to make the ranges objects and pass them
to `token_ranges_owned_by_this_shard`, rather than letting
token_ranges_owned_by_this_shard invoke the right constructor
of the ranges_holder class.
Prepare for making `make` async.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Rather than holding a variant member (and defining
both ranges_holder_{primary,secondary} in both
specilizations of the class, just make the internal
ranges_holder class first-class citizens
and parameterize the `token_ranges_owned_by_this_shard`
template by this class type.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
table_helper::cache_table_info() is fairly convoluted. It cannot be
easily coroutinized since it invokes asynchronous functions in a
catch block, which isn't supported in coroutines. To start to break it
down, extract a block try_prepare() from code that is called twice. It's
both a simplification and a first step towards coroutinization.
The new try_prepare() can return three values: `true` if it succeeded,
`false` if it failed and there's the possibility of attempting a fallback,
and an exception on error.
The `keyspace_compaction` method incorrectly appends the column family
parameter to the URL using a regular string, `"?cf={table}"`, instead of
an f-string, `f"?cf={table}"`. As a result, the column family name is
sent as `{table}` to the server, causing the compaction request to fail.
Fix this issue by passing the parameter to the POST request using a
dictionary instead of appending it to the URL.
Fixes#20264
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closesscylladb/scylladb#20243
before this change, we use the default options when creating `test_env`,
and the default options enable `use_uuid`. but the modes of
`perf-sstables` involving reads assumes that the identifiers are
deterministic. so that the previously written sstables using the "write"
mode can be read with the modes like "index_read", which just uses
`test_env::make_sstable()` in `load_sstables()`, and under the hood,
`test_env::make_sstable()` uses `test_env::new_generation()` for
retrieving the next identifier of sstable. when using integer-base
identifier, this works. as the sstable identifiers are generated
from a monotonically increasing integer sequence, where the identifiers
are deterministic. but this does not apply anymore when the UUID-based
identifiers are used, as the identifiers are generated with a
pseudorandom generator of UUID v1.
in this change, to avoid relying on the determinism of the integer-based
sstable identifier generation, we enumerate sstables by listing the
given directory, and parse the path for their identifier.
after this change, we are able to support the UUID-based sstable
identifier.
another option is disable the UUID-based sstable identifier when
loading sstables. the upside is that this approach is minimal and
straightforward. but the downside is that it encodes the assumption
in the algorithm implicitly, and could be confusing -- we create
a new generation for loading an existing sstable with this generation.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20183
Now the endpoint hanler gets the value from db::config which is not nice
from several perspectives. First, it gets config (ab)using database.
Second, it's compaction manager that "knows" its throughput, global
config is the initial source of that information.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#20173
Currently the major compaction task impl grabs this (non-updateable)
value from db::config. That's not good, all services including
compaction manager have their own configs from which they take options.
Said that, this patch puts the said option onto
compaction_manager::config, makes use of it and configures one from
db::config on start (and tests).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#20174
This is prerequisite for "restore from object storage" feature. In order to collect the sstables in bucket one would need to list the bucket contents with the given prefix. The ListObjectsV2 provides a way for it and here's the respective s3::client extension.
Closesscylladb/scylladb#20120
* github.com:scylladb/scylladb:
test: Add test for s3::client::bucket_lister
s3_client: Add bucket lister
s3_client: Encode query parameter value for query-string
in 947e2814, we pass `--tty` as long as we are using podman _or_
we are in interactive mode. but if we build the tree using podman
using jenkins, we are seeing that ninja is displaying the output
as if it's in an interactive mode. and the output includes ASCII
escape codes. this is distracting.
the reason is that we
* are using podman, and
* ninja tells if it should displaying with a "smart" terminal by
checking istty() and the "TERM" environmental variable.
so, in this change, we add --tty only if
* we are in the interactive mode.
* or stdin is associated with a terminal. this is the use case
where user uses dbuild to interactively build scylla
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20196
in 30e82a81, we add a contraint to the template parameter of
boost_test_print_type() to prevent it from being matched with
types which can be formatted with operator<<. but it failed to
work. we still have test failure reports like:
```
[Exception] - critical check ['s', 's', 't', '_', 'm', 'r', '.', 'i', 's', '_', 'e', 'n', 'd', '_', 'o', 'f', '_', 's', 't', 'r', 'e', 'a', 'm', '(', ')'] has failed
```
this is not what we expect. the reason is that we passed the template
parameters to the `has_left_shift` trait in the wrong order, see
https://live.boost.org/doc/libs/1_83_0/libs/type_traits/doc/html/boost_typetraits/reference/has_left_shift.html.
we should have passed the lhs of operator<< expression as first
parameter, and rhs the second.
so, in this change, we correct the type constraint by passing the
template parameter in the right order, now the error message looks
better, like:
```
test/boost/mutation_query_test.cc(110): error: in "test_partition_query_is_full": check !partition_slice_builder(*s) .with_range({}) .build() .is_full() has failed
```
it turns out boost::transformed_range<> is formattable with operator<<,
as it fulfills the constraints of `boost::has_left_shift<ostream, R>`,
but when printing it, the compiler fails when it tries to insert the
elements in the range to the output stream.
so, in order to workaround this issue, we add a specialization for
`boost::transformed_range<F, R`.
also, to improve the readability, we reimplement the `has_left_shift<>`
as a concept, so that it's obvious that we need to put both the output
stream as the first parameter.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20233