* rename `sync_labels.yaml` to `sync-labels.yaml`
* use more descrptive name for workflow
Closesscylladb/scylladb#17971
* github.com:scylladb/scylladb:
github: sync-labels: use more descriptive name for workflow
github: sync_labels: rename sync_labels to sync-labels
When a keyspace uses tablets, then effective ownership
can be obtained per table. If the user passes only a
keyspace, then /storage_service/ownership/{keyspace}
returns an error.
This change:
- adds an additional positional parameter to 'status'
command that allows a user to query status for table
in a keyspace
- makes usage of /storage_service/ownership/{keyspace}
optional to avoid errors when user tries to obtain
effective ownership of a keyspace that uses tablets
- implements new frontend tests in 'test_status.py'
that verify the new logic
Refs: scylladb#17405
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Closesscylladb/scylladb#17827
"label-sync" is not very helpful for developers to understand what
this workflow is for.
the "name" field of a job shows in the webpage on github of the
pull request against which the job is performed, so if the author
or reviewer checks the status of the pull request, he/she would
notice these names aside of the workflow's name. for this very
job, what we have now is:
```
Sync labels / label-sync
```
after this change it will be:
```
Sync labels / Synchronize labels between PR and the issue(s) fixed by it
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter. but fortunately, fmt v10 brings the builtin
formatter for classes derived from `std::exception`. but before
switching to {fmt} v10, and after dropping `FMT_DEPRECATED_OSTREAM`
macro, we need to print out `std::runtime_error`. so far, we don't
have a shared place for formatter for `std::runtime_error`. so we
are addressing the needs on a case-by-case basis.
in this change, we just print it using `e.what()`. it's behavior
is identical to what we have now.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17954
before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, `fmt::formatter` is added for following types for backward compatibility with {fmt} < 10:
* `utils::bad_exception_container_access`
* `cdc::no_generation_data_exception`
* classes derived from `sstables::malformed_sstable_exception`
* classes derived from `cassandra_exception`
Refs https://github.com/scylladb/scylladb/issues/13245Closesscylladb/scylladb#17944
* github.com:scylladb/scylladb:
cdc: add fmt::formatter for exception types in data_dictionary.hh
utils: add fmt::formatter for utils::bad_exception_container_access
sstables: add fmt::formatter for classes derived from sstables::malformed_sstable_exception
exceptions: add fmt::formatter for classes derived from cassandra_exception
cdc: add fmt::formatter for cdc::no_generation_data_exception
Fixes#16912
By default, ScyllaDB stores the maintenance socket in the workdir. Test.py by default uses the location for the ScyllaDB workdir as testlog/{mode}/scylla-#. The Usual location for cloning the repo is the user's home folder. In some cases, it can lead the socket path being too long and the test will start to fail. The simple way is to move the maintenance socket to /tmp folder to eliminate such a possibility.
Closesscylladb/scylladb#17941
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, `fmt::formatter<>` is added for following classes:
* `cql3::cql3_type`
* `cql3::cql3_type::raw`
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17945
There was two more things missing:
* Allow global options to be positioned before the operation/command option (https://github.com/scylladb/scylladb/issues/16695)
* Ignore JVM args (https://github.com/scylladb/scylladb/issues/16696)
This PR fixes both. With this, hopefully we are fully compatible with nodetool as far as command line parsing is concerned.
After this PR goes in, we will need another fix to tools/java/bin/nodetool-wrapper, to allow user to benefit from this fix. Namely, after this PR, we can just try to invoke scylla-nodetool first with all the command-line args as-is. If it returns with exit-code 100, we fall back to nodetool. We will not need the current trick with `--help $1`. In fact, this trick doesn't work currently, because `$1` is not guaranteed to be the command in the first place.
In addition to the above, this PR also introduces a new option, to help us in the switching process. This is `--rest-api-port`, which can also be provided as `-Dcom.scylladb.apiPort`. When provided, this option takes precedence over `--port|-p`. This is intended as a bridge for `scylla-ccm`, which currently provides the JMX port as `--port`. With this change, it can also provided the REST API port as `-Dcom.scylladb.apiPort`. The legacy nodetool will ignore this, while the native nodetool will use it to connect to the correct REST API address. After the switch we can ditch these options.
Fixes: https://github.com/scylladb/scylladb/issues/16695
Fixes: https://github.com/scylladb/scylladb/issues/16696
Refs: https://github.com/scylladb/scylladb/issues/16679
Refs: https://github.com/scylladb/scylladb/issues/15588Closesscylladb/scylladb#17168
* github.com:scylladb/scylladb:
tools/scylla-nodetool: add --rest-api-port option
tools/scylla-nodetool: ignore JVM args
tools/utils: make finding the operation command line option more flexible
tools/utils: get_selected_operation(): remove alias param
tools: add constant with current help command-line arguments
The recently added test_tablets_migration dominates with it run-time (10
minutes). Also update other tests, e.g. test_read_repair is not in top-7
for any mode, test_replace and test_raft_recovery_majority_loss are both
not notably slower than most of other tests (~40 sec both). On the other
hand, the test_raft_recovery_basic and test_group0_schema_versioning are
both 1+ minute
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#17927
The initial version used a redundant method, and it did not cover all
cases. So that leads to the flakiness of the test that used this method.
Switching to the cluster_con() method removes flakiness since it's
written more robustly.
Fixesscylladb/scylladb#17914Closesscylladb/scylladb#17932
The group0 state machine calls `merge_topology_snapshot` from
`transfer_snapshot`. It feeds it with `raft_topology_snapshot` returned
from `raft_pull_topology_snapshot`. This snapshot includes the entire
`system.cdc_generations_v3` table. It can be huge and break the
commitlog `max_record_size` limit.
The `system.cdc_generations_v3` is a single-partition table, so all the
data is contained in one mutation object. To fit the commitlog limit we
split this mutation into many smaller ones and apply them in separate
`database::apply` calls. That means we give up the atomicity guarantee,
but we actually don't need it for `system.cdc_generations_v3` and
`system.topology_requests`.
This PR fixes the dtest
`update_cluster_layout_tests.py::TestLargeScaleCluster::test_add_many_nodes_under_load`
Fixesscylladb/scylladb#17545Closesscylladb/scylladb#17632
* github.com:scylladb/scylladb:
test_cdc_generation_data: test snapshot transfer
storage_service::merge_topology_snapshot: handle big cdc_generations_v3 mutations
mutation: add split_mutation function
storage_service::merge_topology_snapshot: fix indentation
sstables::test_env is intended for sstable unit tests, but to satisfy its
dependency of an sstables_registry we instantiate an entire database.
Remove the dependency by having a mock implementation of sstables_registry
and using that instead.
Closesscylladb/scylladb#17895
If there is a bug in the tablet scheduler which makes it never
converge for a given state of topology, rebalance_tablets() will never
complete and will generate a huge amounts of logs. This patch adds a
sanity limit so that we fail earlier.
This was observed in one of the test_load_balancing_with_random_load runs in CI.
Fixesscylladb/scylladb#17894.
Closesscylladb/scylladb#17916
The series marks nodes to be non expiring in the address map earlier, when
they are placed in the topology.
Fixes: scylladb/scylladb#16849
* 'gleb/16849-fix-v2' of github.com:scylladb/scylla-dev:
test: add test to check that address cannot expire between join request placemen and its processing
topology_coordinator: set address map entry to nonexpiring when a node is added to the topology
raft_group0: add modifiable_address_map() function
There is no need to map this node's inet_address to host_id. The
storage_service can easily just pass the local host_id. While at it, get
the other node's host_id directly from their endpoint_state instead of
looking it up yet again in the gossiper, using the nodes' address.
Refs #12283Closesscylladb/scylladb#17919
* github.com:scylladb/scylladb:
cdc: should_propose_first_generation: get my_host_id from caller
storage_service: add my_host_id
before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define formatters for
* raft_call
* raft_read
* network_majority_grudge
* reconfiguration
* stop_crash
* operation::thread_id
* append_seq
* AppendReg::append
* AppendReg::ret
* operation::either_of<Ops...>
* operation::exceptional_result<Op>
* operation::completion<Op>
* operation::invocable<Op>
and drop their operator<<:s.
in which,
* `operator<<` for append_entry is never used. so it is removed.
* `operator<<` for `std::monostate` and `std::variant` are dropped. as we are now using their counterparts in {fmt}.
* stop_crash::result_type 's `fmt::formatter` is not added, as we cannot define a partial specialization of `fmt::formatter` for a nested class for a template class. we will tackle this struct in another change.
Refs #13245Closesscylladb/scylladb#17884
* github.com:scylladb/scylladb:
test: raft: generator: add fmt::formatter:s
test: randomized_nemesis_test: add fmt::formatter for some types
test: randomized_nemesis_test: add fmt::formatter for seastar::timed_out_error
raft: add fmt::formatter for error classes
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, `fmt::formatter<>` is added for following classes for
backward compatibility with {fmt} < 10:
* `data_dictionary::no_such_keyspace`
* `data_dictionary::no_such_column_family`
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, `fmt::formatter<utils::bad_exception_container_access>` is
added for backward compatibility with {fmt} < 10.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, `fmt::formatter<T>` is added for classes derived from
`malformed_sstable_exception`, where `T` is the class type derived from
`malformed_sstable_exception`.
this change is implemented to be backward compatible with {fmt} < 10.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, `fmt::formatter<T>` is added for classes derived from
`cassandra_exception`, where `T` is the class type derived from
`cassandra_exception`.
this change is implemented to be backward compatible with {fmt} < 10.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, `fmt::formatter<cdc::no_generation_data_exception>` is
added for backward compatibility with {fmt} < 10.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The test only looked at the initial cdc_generation
generation. It made the changes bigger to go
past the raft max_command_size limit.
It then made sure this large mutation set is saved
in several raft commands.
In this commit we enhance the test to check that the
mutations are properly handled during snapshot transfer.
The problem is that the entire system.cdc_generations_v3
table is read into the topology_snapshot and it's total
size can exceed the commitlog max_record_size limit.
We need a separate injection since the compaction
could nullify the effects of the previous injection.
The test fails without the fix from the previous commit.
The group0 state machine calls merge_topology_snapshot
from transfer_snapshot. It feeds it with raft_topology_snapshot
returned from raft_pull_topology_snapshot. This snapshot
includes the entire system.cdc_generations_v3 table.
It can be huge and break the commitlog max_record_size limit.
The system.cdc_generations_v3 is a single-partition table,
so all the data is contained in one mutation object. To
fit the commitlog limit we split this mutation into several
smaller ones and apply them in separate database::apply calls.
That means we give up the atomicity guarantee, but we
actually don't need it for system.cdc_generations_v3.
The cdc_generations_v3 data is not used in any way until
it's referenced from the topology table. By applying the
cdc_generations_v3 mutations before topology mutations
we ensure that the lack of atomicity isn't a problem here.
The database::apply method takes frozen_mutation parameter by
const reference, so we need to keep them alive until
all the futures are complete.
fixes#17545
The function splits the source mutation into multiple
mutations so that their size does not exceed the
max_size limit. The size of a mutation is calculated
as the sum of the memory_usage() of its constituent
mutation_fragments.
The implementation is taken from view_updating_consumer.
We use mutation_rebuilder_v2 to reconstruct mutations from
a stream of mutation fragments and recreate the output
mutation whenever we reach the limit.
We'll need this function in the next commit.
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* operation::either_of<Ops...>
* operation::exceptional_result<Op>
* operation::completion<Op>
* operation::invocable<Op>
and drop their operator<<:s.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* raft_call
* raft_read
* network_majority_grudge
* reconfiguration
* stop_crash
* operation::thread_id
* append_seq
* append_entry
* AppendReg::append
* AppendReg::ret
and drop their operator<<:s.
in which,
* `operator<<` for `std::monostate` and `std::variant` are dropped.
as we are now using their counterparts in {fmt}.
* stop_crash::result_type 's `fmt::formatter` is not added, as we
cannot define a partial specialization of `fmt::formatter` for
a nested class for a template class. we will tackle this struct
in another change.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatter for `seastar::timed_out_error`,
which will be used by the `fmt::formatter` for `std::variant<...>`.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatter for classes derived from
`raft::error`. since {fmt} v10 defines the formatter for all classes
derived from `std::exception`, the definition is provided only when
the tree is compiled with {fmt} < 10.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The token ring table is a virtual table (`system.token_ring`), which contains the ring information for all keyspaces in the system. This is essentially an alternative to `nodetool describering`, but since it is a virtual table, it allows for all the usual filtering/aggregation/etc. that CQL supports.
Up until now, this table only supported keyspaces which use vnodes. This PR adds support for tablet keyspaces. To accommodate these keyspaces a new `table_name` column is added, which is set to `ALL` for vnodes keyspaces. For tablet keyspaces, this contains the name of the table.
Simple sanity tests are added for this virtual table (it had none).
Fixes: #16850Closesscylladb/scylladb#17351
* github.com:scylladb/scylladb:
test/cql-pytest: test_virtual_tables: add test for token_ring table
db/virtual_tables: token_ring_table: add tablet support
db/virtual_tables: token_ring_table: add table_name column
db/virtual_tables: token_ring_table: extract ring emit
service/storage_service: describe_ring_for_table(): use topology to map hostid to ip
There is no need to map this node's inet_address to host_id.
The storage_service can easily just pass the local host_id.
While at it, get the other node's host_id directly
from their endpoint_state instead of looking it up
yet again in the gossiper, using the nodes' address.
Refs #12283
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Shorthand for getting this node's host_id
from token_metadata.topology, similar to the
`get_broadcast_address` helper.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
In topology on raft, management of CDC generations is moved to the topology coordinator.
We need to verify that the CDC keeps working correctly during the upgrade for topology on the raft.
A similar change will be made in the topology recovery test. It will reuse
the `start_writes_to_cdc_table` function.
Ref #17409Closesscylladb/scylladb#17828
This PR contains few fixes and improvment seen during
https://github.com/scylladb/scylladb/issues/15902 label addtion
When we add a label to an issue, we go through all PR.
1) Setting PR base to `master` (release PR are not relevant)
2) Since for each Issue we have only one PR, ending the search after a
match was found
3) Make sure to skip PR with empty body (mainly debug one)
4) Set backport label prefix to `backport/`
Closesscylladb/scylladb#17912
Introduces relative link support for individual properties listed on the configuration properties page. For instance, to link to a property from a different document, use the syntax :ref:`memtable_flush_static_shares <confprop_memtable_flush_static_shares>`.
Additionally, it also adds support for linking groups. For example, :ref:`Ungrouped properties <confgroup_ungrouped_properties>`.
Closesscylladb/scylladb#17753
> Revert "build: do not provide zlib as an ingredient"
> Fix reference to sstring type in tutorial about concurrency in coroutines
> Merge 'Adding a Metrics tester app' from Amnon Heiman
> cooking.sh: do not quote backtick in here document
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17887
Affects load-and-stream for tablets only.
The intention is that only this loop is responsible for detecting
exhausted sstables and then discarding them for next iterations:
while (sstable_it != _sstables.rend() && exhausted(*sstable_it)) {
sstable_it++;
}
But the loop which consumes non exhausted sstables, on behalf of
each tablet, was incorrectly advancing the iterator, despite the
sstable wasn't considered exhausted.
Fixes#17733.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#17899
When we open a backport PR we should make sure the patch contains a ref to the issue it suppose to fix in order to make sure we have more accurate backport information
This action will only be triggered when base branch is `branch-*`
If `Fixes` are missing, this action will fail and notify the author.
Ref: https://github.com/scylladb/scylla-pkg/issues/3539Closesscylladb/scylladb#17897
The test dtest materialized_views_test.py::TestMaterializedViews::
test_mv_populating_from_existing_data_during_truncate reproduces an
assertion failure, and crash, while doing a CREATE MATERIALIZED VIEW
during a TRUNCATE operation.
This patch fixes the crash by removing the assert() call for a view
(replacing it by a warning message) - we'll explain below why this is fine.
Also for base tables change we change the assertion to an on_internal_error
(Refs #7871).
This makes the test stop crashing Scylla, but it still fails due to
issue #17635.
Let's explain the crash, and the fix:
The test starts TRUNCATE on table that doesn't yet have a view.
truncate_table_on_all_shards() begins by disabling compaction on
the table and all its views (of which there are none, at this
point). At this point, the test creates a new view is on this table.
The new view has, by default, compaction enabled. Later, TRUNCATE
calls discard_sstables() on this new view, asserts that it has
compaction disabled - and this assertion fails.
The fix in this patch is to not do the assert() for views. In other words,
we acknowledge that in this use case, the view *will* have compactions
enabled while being truncated. I claim that this is "good enough", if we
remember *why* we disable compaction in the first place: It's important
to disable compaction while truncating because truncating during compaction
can lead us to data resurection when the old sstable is deleted during
truncation but the result of the compaction is written back. True,
this can now happen in a new view (a view created *DURING* the
truncation). But I claim that worse things can happen for this
new view: Notably, we may truncate a view and then the ongoing
view building (which happens in a new view) might copy data from
the base to the view and only then truncate the base - ending up
with an empty base and non-empty view. This problem - issue #17635 -
is more likely, and more serious, than the compaction problem, so
will need to be solved in a separate patch.
Fixes#17543.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#17634
When a sstable set is cloned, we don't want a change in cloned set
propagating to the former one.
It happens today with partitioned_sstable_set::_all_runs, because
sets are sharing ownership of runs, which is wrong.
Let's not violate clone semantics by copying all_runs when cloning.
Doesn't affect data correctness as readers work directly with
sstables, which are properly cloned. Can result in a crash in ICS
when it is estimating pending tasks, but should be very rare in
practice.
Fixes#17878.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#17879
This option is an alternative to --port|-p and takes precedence over it.
This is meant to aid the switch from the legacy nodetool to the native
one. Users of the legacy nodetool pass the port of JMX to --port. We
need a way to provide both the JMX port (via --port) and also the REST
API port, which only the native nodetool will interpret. So we add this
new --rest-api-port, which when provided, overwrites the --port|-p
option. To ensure the legacy nodeotol doesn't try to interpret this,
this option can also be provided as -Dcom.scylladb.apiPort (which is
substituted to --rest-api-port behind the scenes).