before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define formatters for
* raft_call
* raft_read
* network_majority_grudge
* reconfiguration
* stop_crash
* operation::thread_id
* append_seq
* AppendReg::append
* AppendReg::ret
* operation::either_of<Ops...>
* operation::exceptional_result<Op>
* operation::completion<Op>
* operation::invocable<Op>
and drop their operator<<:s.
in which,
* `operator<<` for append_entry is never used. so it is removed.
* `operator<<` for `std::monostate` and `std::variant` are dropped. as we are now using their counterparts in {fmt}.
* stop_crash::result_type 's `fmt::formatter` is not added, as we cannot define a partial specialization of `fmt::formatter` for a nested class for a template class. we will tackle this struct in another change.
Refs #13245Closesscylladb/scylladb#17884
* github.com:scylladb/scylladb:
test: raft: generator: add fmt::formatter:s
test: randomized_nemesis_test: add fmt::formatter for some types
test: randomized_nemesis_test: add fmt::formatter for seastar::timed_out_error
raft: add fmt::formatter for error classes
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* operation::either_of<Ops...>
* operation::exceptional_result<Op>
* operation::completion<Op>
* operation::invocable<Op>
and drop their operator<<:s.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* raft_call
* raft_read
* network_majority_grudge
* reconfiguration
* stop_crash
* operation::thread_id
* append_seq
* append_entry
* AppendReg::append
* AppendReg::ret
and drop their operator<<:s.
in which,
* `operator<<` for `std::monostate` and `std::variant` are dropped.
as we are now using their counterparts in {fmt}.
* stop_crash::result_type 's `fmt::formatter` is not added, as we
cannot define a partial specialization of `fmt::formatter` for
a nested class for a template class. we will tackle this struct
in another change.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatter for `seastar::timed_out_error`,
which will be used by the `fmt::formatter` for `std::variant<...>`.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatter for classes derived from
`raft::error`. since {fmt} v10 defines the formatter for all classes
derived from `std::exception`, the definition is provided only when
the tree is compiled with {fmt} < 10.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The token ring table is a virtual table (`system.token_ring`), which contains the ring information for all keyspaces in the system. This is essentially an alternative to `nodetool describering`, but since it is a virtual table, it allows for all the usual filtering/aggregation/etc. that CQL supports.
Up until now, this table only supported keyspaces which use vnodes. This PR adds support for tablet keyspaces. To accommodate these keyspaces a new `table_name` column is added, which is set to `ALL` for vnodes keyspaces. For tablet keyspaces, this contains the name of the table.
Simple sanity tests are added for this virtual table (it had none).
Fixes: #16850Closesscylladb/scylladb#17351
* github.com:scylladb/scylladb:
test/cql-pytest: test_virtual_tables: add test for token_ring table
db/virtual_tables: token_ring_table: add tablet support
db/virtual_tables: token_ring_table: add table_name column
db/virtual_tables: token_ring_table: extract ring emit
service/storage_service: describe_ring_for_table(): use topology to map hostid to ip
In topology on raft, management of CDC generations is moved to the topology coordinator.
We need to verify that the CDC keeps working correctly during the upgrade for topology on the raft.
A similar change will be made in the topology recovery test. It will reuse
the `start_writes_to_cdc_table` function.
Ref #17409Closesscylladb/scylladb#17828
This PR contains few fixes and improvment seen during
https://github.com/scylladb/scylladb/issues/15902 label addtion
When we add a label to an issue, we go through all PR.
1) Setting PR base to `master` (release PR are not relevant)
2) Since for each Issue we have only one PR, ending the search after a
match was found
3) Make sure to skip PR with empty body (mainly debug one)
4) Set backport label prefix to `backport/`
Closesscylladb/scylladb#17912
Introduces relative link support for individual properties listed on the configuration properties page. For instance, to link to a property from a different document, use the syntax :ref:`memtable_flush_static_shares <confprop_memtable_flush_static_shares>`.
Additionally, it also adds support for linking groups. For example, :ref:`Ungrouped properties <confgroup_ungrouped_properties>`.
Closesscylladb/scylladb#17753
> Revert "build: do not provide zlib as an ingredient"
> Fix reference to sstring type in tutorial about concurrency in coroutines
> Merge 'Adding a Metrics tester app' from Amnon Heiman
> cooking.sh: do not quote backtick in here document
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17887
Affects load-and-stream for tablets only.
The intention is that only this loop is responsible for detecting
exhausted sstables and then discarding them for next iterations:
while (sstable_it != _sstables.rend() && exhausted(*sstable_it)) {
sstable_it++;
}
But the loop which consumes non exhausted sstables, on behalf of
each tablet, was incorrectly advancing the iterator, despite the
sstable wasn't considered exhausted.
Fixes#17733.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#17899
When we open a backport PR we should make sure the patch contains a ref to the issue it suppose to fix in order to make sure we have more accurate backport information
This action will only be triggered when base branch is `branch-*`
If `Fixes` are missing, this action will fail and notify the author.
Ref: https://github.com/scylladb/scylla-pkg/issues/3539Closesscylladb/scylladb#17897
The test dtest materialized_views_test.py::TestMaterializedViews::
test_mv_populating_from_existing_data_during_truncate reproduces an
assertion failure, and crash, while doing a CREATE MATERIALIZED VIEW
during a TRUNCATE operation.
This patch fixes the crash by removing the assert() call for a view
(replacing it by a warning message) - we'll explain below why this is fine.
Also for base tables change we change the assertion to an on_internal_error
(Refs #7871).
This makes the test stop crashing Scylla, but it still fails due to
issue #17635.
Let's explain the crash, and the fix:
The test starts TRUNCATE on table that doesn't yet have a view.
truncate_table_on_all_shards() begins by disabling compaction on
the table and all its views (of which there are none, at this
point). At this point, the test creates a new view is on this table.
The new view has, by default, compaction enabled. Later, TRUNCATE
calls discard_sstables() on this new view, asserts that it has
compaction disabled - and this assertion fails.
The fix in this patch is to not do the assert() for views. In other words,
we acknowledge that in this use case, the view *will* have compactions
enabled while being truncated. I claim that this is "good enough", if we
remember *why* we disable compaction in the first place: It's important
to disable compaction while truncating because truncating during compaction
can lead us to data resurection when the old sstable is deleted during
truncation but the result of the compaction is written back. True,
this can now happen in a new view (a view created *DURING* the
truncation). But I claim that worse things can happen for this
new view: Notably, we may truncate a view and then the ongoing
view building (which happens in a new view) might copy data from
the base to the view and only then truncate the base - ending up
with an empty base and non-empty view. This problem - issue #17635 -
is more likely, and more serious, than the compaction problem, so
will need to be solved in a separate patch.
Fixes#17543.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#17634
When a sstable set is cloned, we don't want a change in cloned set
propagating to the former one.
It happens today with partitioned_sstable_set::_all_runs, because
sets are sharing ownership of runs, which is wrong.
Let's not violate clone semantics by copying all_runs when cloning.
Doesn't affect data correctness as readers work directly with
sstables, which are properly cloned. Can result in a crash in ICS
when it is estimating pending tasks, but should be very rare in
practice.
Fixes#17878.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#17879
Fixes#17569
Tests are not closing file descriptor after it finishes. This leads to inability to continue tests since the default value for opened files in Linux is 1024. Issue easy to reproduce with the next command:
```
$ ./test.py --mode debug test_native_transport --repeat 1500
```
After fix applied all tests are passed with a next command:
```
$ ./test.py --mode debug test_native_transport --repeat 10000
```
Closesscylladb/scylladb#17798
Document the manual upgrade procedure that is required to enable
consistent cluster management in clusters that were upgraded from an
older version to ScyllaDB Open Source 6.0. This instruction is placed in
previously placeholder "Enable Raft-based Topology" page which is a part
of the upgrade instructions to ScyllaDB Open Source 6.0.
Add references to the new description in the "Raft Consensus Algorithm
in ScyllaDB" document in relevant places.
Extend the "Handling Node Failures" document so that it mentions steps
required during recovery of a ScyllaDB cluster running version 6.0.
Fixes: scylladb/scylladb#17341Closesscylladb/scylladb#17624
After merging https://github.com/scylladb/scylladb/pull/17365, all backport labels should be added to PR (before we used to add backport labels to the issues).
Adding a GitHub action which will be triggered in the following conditions only:
1) The base branch is `master` or `next`
2) Pull request events:
- opened: For every new PR that someone opens, we will sync all labels from the linked issue (if available)
- labeled: This role only applies to labels with the `backport/` prefix. When we add a new label for the backport we will update the relevant issue or PR to get them both to sync
- unlabeled: Same as `labeled` only applies to the `backport/` prefix. When we remove a label for backport we will update the relevant issue or pr
Closesscylladb/scylladb#17715
sstables_manager now depends on system_keyspace for access to the
system.sstables table, needed by object storage. This violates
modularity, since sstables_manager is a relatively low-level leaf
module while system_keyspace integrates large parts of the system
(including, indirectly, sstables_manager).
One area where this is grating is sstables::test_env, which has
to include the much higher level cql_test_env to accommodate it.
Fix this by having sstables_manager expose its dependency on
system_keyspace as an interface, sstables_registry, and have
system_keyspace implement the glue logic in
system_keyspace_sstables_manager.
Closesscylladb/scylladb#17868
This commit updates the Upgrade ScyllaDB Image page.
- It removes the incorrect information that updating underlying OS packages is mandatory.
- It adds information about the extended procedure for non-official images.
Closesscylladb/scylladb#17867
group0 operations a valid on shard 0 only. Assert that. We already do
that in the version of the function that gets abort source.
Message-ID: <ZeCti70vrd7UFNim@scylladb.com>
Lot's of BOOST_REQUIRES in this test require some integers to be in some
eq/gt/le relations to each other. And one place that compares rack names
as strings. Using more verbose boost checkers is preferred in such cases
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#17866
this change is a follow up of ca7f7bf8e2, which changed the output path to build/$<CONFIG>/debian. but what dist/docker/debian/build_docker.sh expects is `build/dist/$config/debian/*.deb`, where `$config` is the normalized mode, when the debian packages are built using CMake generated rules, `$mode` is CMake configuration name, i.e., `$<CONFIG>`. so, ca7f7bf8e2 made a mistake, as it does not match the expectation of `build_docker.sh`.
in this change, this issue is addressed. so we use the same path in both `dist/CMakeLists.txt` and `dist/docker/debian/build_docker.sh`.
Closesscylladb/scylladb#17848
* github.com:scylladb/scylladb:
build: cmake: add dist-* targets to the default build target
build: cmake: put server deb packages under build/dist/$<CONFIG>/debian
This PR fixes a problem with replacing a node with tablets when
RF=N. Currently, this will fail because tablet replica allocation for
rebuild will not be able to find a viable destination, as the replacing node
is not considered to be a candidate. It cannot be a candidate because
replace rolls back on failure and we cannot roll back after tablets
were migrated.
The solution taken here is to not drain tablet replicas from replaced
node during topology request but leave it to happen later after the
replaced node is in left state and replacing node is in normal state.
The replacing node waits for this draining to be complete on boot
before the node is considered booted.
Fixes https://github.com/scylladb/scylladb/issues/17025
Nodes in the left state will be kept in tablet replica sets for a while after node
replace is done, until the new replica is rebuilt. So we need to know
about those node's location (dc, rack) for two reasons:
1) algorithms which work with replica sets filter nodes based on their location. For example materialized views code which pairs base replicas with view replicas filters by datacenter first.
2) tablet scheduler needs to identify each node's location in order to make decisions about new replica placement.
It's ok to not know the IP, and we don't keep it. Those nodes will not
be present in the IP-based replica sets, e.g. those returned by
get_natural_endpoints(), only in host_id-based replica
sets. storage_proxy request coordination is not affected.
Nodes in the left state are still not present in token ring, and not
considered to be members of the ring (datacanter endpoints excludes them).
In the future we could make the change even more transparent by only
loading locator::node* for those nodes and keeping node* in tablet replica sets.
Currently left nodes are never removed from topology, so will
accumulate in memory. We could garbage-collect them from topology
coordinator if a left node is absent in any replica set. That means we
need a new state - left_for_real.
Closesscylladb/scylladb#17388
* github.com:scylladb/scylladb:
test: py: Add test for view replica pairing after replace
raft, api: Add RESTful API to query current leader of a raft group
test: test_tablets_removenode: Verify replacing when there is no spare node
doc: topology-on-raft: Document replace behavior with tablets
tablets, raft topology: Rebuild tablets after replacing node is normal
tablets: load_balancer: Access node attributes via node struct
tablets: load_balancer: Extract ensure_node()
mv: Switch to using host_id-based replica set
effective_replication_map: Introduce host_id-based get_replicas()
raft topology: Keep nodes in the left state to topology
tablets: Introduce read_required_hosts()
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* test_data in two different tests
* row_cache_stress_test::reader_id
and drop its operator<<.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17861
Clang > 12 starts to complain like
```
warning: '-fuse-ld=' taking a path is deprecated; use '--ld-path=' instead [-Wfuse-ld-path]'
```
this option is not supported by GCC yet. also instead of using
the generic driver's name, use the specific name. otherwise ld
fails like
```
lld is a generic driver.
Invoke ld.lld (Unix), ld64.lld (macOS), lld-link (Windows), wasm-ld (WebAssembly) instead
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17825
For now test is incomplete in several ways
1. It xfails, until #17116
2. It doesn't rebuild/repair tablets
3. It doesn't check that tablet data actually exists on replicas
refs: #17575
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#17808
When Mergify open a backport PR and identify conflicts, it adding the
`conflicts` label. Since GitHub can't identify conflicts in PR, setting
a role to move PR to draft, this way we will not trigger CI
Once we resolve the conflicts developer should make the PR `ready for
review` (which is not draft) and then CI will be triggered
`conflict` label can also be removed
Closesscylladb/scylladb#17834
Currently, when dividing memory tracked for a batch of updates
we do not take into account the overhead that we have for processing
every update. This patch adds the overhead for single updates
and joins the memory calculation path for batches and their parts
so that both use the same overhead.
Fixes#17854Closesscylladb/scylladb#17855
also, add a target of `dist-server`, which mirrors the structure
of the targets created by `configure.py`, and it is consistent
with the ones defined by `build_submodule()`.
so that they are built when our CI runs `ninja -C $build`. CI
expects that all these rpm and deb packages to built when
`ninja -C $build` finishes. so that it can continue with
building the container image. let's make it happen. so that
the CMake-based rules can work better with CI.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
It was observed that some use cases might append old data constantly to
memtable, blocking GC of expired tombstones.
That's because timestamp of memtable is unconditionally used for
calculating max purgeable, even when the memtable doesn't contain the
key of the tombstone we're trying to GC.
The idea is to treat memtable as we treat L0 sstables, i.e. it will
only prevent GC if it contains data that is possibly shadowed by the
expired tombstone (after checking for key presence and timestamp).
Memtable will usually have a small subset of keys in largest tier,
so after this change, a large fraction of keys containing expired
tombstones can be GCed when memtable contains old data.
Fixes#17599.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#17835
There is no point in checking `sst->filter_has_key(*hk)`
if the sstable contains no data older than the running
minimum timestamp, since even if it matches, it won't change
the minimum.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#17839
Fix writing cassandra-rackdc.properties with correct format data instead of yaml
Add a parameter to overwrite RF for specific DC
Add the possibility to connect cql to the specific node
In this PR 4 tests were added to test multi-DC functionality. One is added from initial commit were multi-DC possibility were introduced, however, this test was not commited. Three of them are migrations from dtest, that later will be deleted. To be able to execute migrated tests additional functionality is added: the ability to connect cql to the specific node in the cluster instead of pooled connection and the possibility to overwrite the replication factor for the specific DC. To be able to use the multi DC in test.py issue with the incorrect format of the properties file fixed in this PR.
Closesscylladb/scylladb#17503
With large schemas, unfreezing can stall, especially as it requires
a lot of memory. Switch to a gentle version that will not stall.
As a preparation step, we add unfreeze_gently() for a span of mutations.
Fixes#17841Closesscylladb/scylladb#17842
* github.com:scylladb/scylladb:
schema_tables: unfreeze frozen_mutation:s gently
frozen_mutation: add unfreeze_gently(span<frozen_mutation>)
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for `perf_result_with_aio_writes`,
and drop its operator<<.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17849
Newly joining nodes may not have a host id yet. Handle this and print a
"?" for these nodes, instead of the host-id.
Extend the existing test for joining node case (also rename it and add
comment).
Closesscylladb/scylladb#17853
this change is a follow up of ca7f7bf8e2, which changed the output path
to build/$<CONFIG>/debian. but what dist/docker/debian/build_docker.sh
expects is `build/dist/$config/debian/*.deb`, where `$config` is the
normalized mode, when the debian packages are built using CMake
generated rules, `$mode` is CMake configuration name, i.e., `$<CONFIG>`.
so, ca7f7bf8e2 made a mistake, as it does not match the expectation of
`build_docker.sh`.
in this change, this issue is addressed. so we use the same path
in both `dist/CMakeLists.txt` and `dist/docker/debian/build_docker.sh`.
apply the same change to `dist-server-rpm`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
test/raft/replication.cc defines a symbol named `tlogger`, while
test/raft/randomized_nemesis_test.cc also defines a symbol with
the same name. when linking the test with mold, it identified the ODR
violation.
in this change, we extract test-raft-helper out, so that
randomized_nemesis_test can selectively only link against this library.
this also matches with the behavior of the rules generated by `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17836
in gossiping_property_file_snitch_test, we use
`BOOST_REQUIRE_EQUAL(dc_racks[i], dc_racks[0])` to check the equality
of two instances of `pair<sstring, sstring`, like:
```c++
BOOST_REQUIRE_EQUAL(dc_racks[i], dc_racks[0])
```
since the standard library does not provide the formatter for printing
`std::pair<>`, we rely on the homebrew generic formatter to
print `std::pair<>, which in turn uses operator<< to format the
elements in the `pair`, but we intend to remove this formatter
in future, as the last step of #13245 .
so in order to enable Boost.test to print out lhs and rhs when
`BOOST_REQUIRE_EQUAL` check fails, we are adding
`boost_test_print_type()` for `pair<sstring,sstring>`. the helper
function uses {fmt} to print the `pair<>`.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17831
this change prepares for the fmt::formatter based formatter used by
tests, which will use {fmt} to print the elements in a container,
so we need to define the formatter using fmt::formatter for these
element. the operator<< for service_level_options::workload_type is
preserved, as the tests are still using it.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17837
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
row_level_diff_detect_algorithm. please note, we already have
`format_as()` overload for this type, but we cannot use it as a
fallback of the proper `fmt::formatter<>` specialization before
{fmt} v10. so before we update our CI to a distro with {fmt} v10,
`fmt::formatter<row_level_diff_detect_algorithm>` is still
needed.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17824
The assert_that_failed(future) pair of helpers are templates with
variadic futures, but since they are gone in seastar, so should they in
test/lib
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#17830
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for `db::functions::function`.
please note, because we use `std::ostream` as the parameter of
the polymorphism implementation of `function::print()`.
without an intrusive change, we have to use `fmt::ostream_formatter`
or at least use similar technique to format the `function` instance
into an instance of `ostream` first. so instead of implementing
a "native" `fmt::formatter`, in this change, we just use
`fmt::ostream_formatter`.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17832
Empty histograms are missing some of the members that non-empty
histograms have. The code handling these histograms assumed all required
members are always present and thus error out when receiving an empty
histogram.
Add tests for empty histograms and fix the code handling them to check
for the potentially missing members, instead of making assumptions.
Closesscylladb/scylladb#17816