This change is intended to remove the dependency to
operator<<(std::ostream&, const std::unordered_set<T>&)
from auth_resource_test.cc.
It prepares the test for removal of the templated helpers
from utils/to_string.hh, which is one of goals of the
referenced issue that is linked below.
Refs: #13245
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Closesscylladb/scylladb#16754
this is to mimic the formatting of `human_readable_value`, and to prepare for consolidating these two formatters, so we don't have two pretty printers in the tree.
Closesscylladb/scylladb#16726
* github.com:scylladb/scylladb:
utils/pretty_printers: add "I" specifier support
utils/pretty_printers: use the formatting of to_hr_size()
Store schema_ptr in reader permit instead of storing a const pointer to
schema to ensure that the schema doesn't get changed elsewhere when the
permit is holding on to it. Also update the constructors and all the
relevant callers to pass down schema_ptr instead of a raw pointer.
Fixes#16180
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closesscylladb/scylladb#16658
this is to mimic the formatting of `human_readable_value`, and
to prepare for consolidating these two formatters, so we don't have
two pretty printers in the tree.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
keep the precision of 4 digits, for instance, so that we format
"8191" as "8191" instead of as "8 Ki". this is modeled after
the behavior of `to_hr_size()`. for better user experience.
and also prepares to consolidate these two formatters.
tests are updated to exercise both IEC and SI notations.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this target is used by test.py for enumerating unit tests
* test/CMakeLists.txt: append executable's full path to
`scylla_tests`. add `unit_test_list` target printing
`scylla_tests`, please note, `cmake -E echo` does not
support the `-e` option of `echo`, and ninja does not
support command line with newline in it, we have to use
`echo` to print the list of tests.
* test/{boost,raft,unit}/CMakeLists.txt: set scylla_tests
only if $PWD/suite.yaml exists. we could hardwire this
logic in these files, as it is known that this file
exists in these directory, but this is still put this way,
so that it serves as a comment explaining that the reason
why we update scylla_tests here but not somewhere else
where we also use `add_scylla_test()` function is just
suite.yaml exists here.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16702
In 7d5e22b43b ("replica: memtable: don't forget memtable
memory allocation statistics") we taught memtable_list to remember
learned memory allocation reserves so a new memtable inherits these
statistics from an older memtable. Share it now further across tablets
that belong to the same table as well. This helps the statistics be more
accurate for tablets that are migrated in, as they can share existing
tablet's memory allocation history.
Closesscylladb/scylladb#16571
* github.com:scylladb/scylladb:
table, memtable: share log-structured allocator statistics across all memtables in a table
memtable: consolidate _read_section, _allocating_section in a struct
Previously, the tablet information was sent to the drivers
in two pieces within the custom_payload. We had information
about the replicas under the `tablet_replicas` key and token range
information under `token_range`. These names were quite generic
and might have caused problems for other custom_payload users.
Additionally, dividing the information into two pieces raised
the question of what to do if one key is present while the other
is missing.
This commit changes the serialization mechanism to pack all information
under one specific name, `tablets-routing-v1`.
From: Sylwia Szunejko <sylwia.szunejko@scylladb.com>
Closesscylladb/scylladb#16148
So that a single centrally managed db::config instance can be shared by
all code requiring it, instead of creating local instances where needed.
This is required to load schema from encrypted schema-tables, and it
also helps memory consumption a bit (db::config consumes a lot of
memory).
Fixes: #16480Closesscylladb/scylladb#16495
Those two members are passed from memtable_list to memtable. Since we
wish to pass them from table, it becomes awkward to pass them as two
separate variables as their contents are specific to memtable internals.
Wrap them in a name that indicates their role (being table-wide shared
data for memtables) and pass them as a unit.
The option is kepd in DDL, but is _not_ stored in
system_schema.keyspaces. Instead, it's removed from the provided options
and kept in scylla_keyspaces table in its own column. All the places
that had optional initial_tablets disengaged now set this value up the
way the find appropriate.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now all the callers have it at hands (spoiler: not yet initialized, but
still) so the params can also have it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Previous patch added params to r.s. classes' constructors, but callers
don't construct those directly, instead they use the create_r.s.()
wrapper. This patch adds params to the wrapper too.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When replication strategy class is created caller parr const reference
on the config options which is, in turn, a map<string, string>. In the
future r.s. classes will need to get "scylla specific" info along with
legacy options and this patch prepares for that by passing more generic
params argument into constructor. Currently the only inhabitant of the
new params is the legacy options.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Observer, that references table_for_test, must of course, not
outlive table_for_test. Observer can be called later after the
last input sstable is removed from sstable manager.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#16428
The interface is fragile because the user may incorrectly use the
wrong "gc before". Given that sstable knows how to properly calculate
"gc before", let's do it in estimate__d__t__r(), leaving no room
for mistakes.
sstable_run's variant was also changed to conform to new interface,
allowing ICS to properly estimate droppable ratio, using GC before
that is calculated using each sstable's range. That's important for
upcoming tablets, as we want to query only the range that belongs
to a particular tablet in the repair history table.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#15931
Fixes#16312
This test replays a segment before it might be closed or even fully flushed, thus it can (with the new semantics) generate a segment_truncation exception if hitting eof earlier than expected. (Note: test does not use pre-allocated segments).
(First patch makes the test coroutinized to make for a nicer, easier fix change.
Closesscylladb/scylladb#16368
* github.com:scylladb/scylladb:
commitlog_test::test_commitlog_reader: handle segment_truncation
commitlog_test: coroutinize test_commitlog_reader
We make `consistent_cluster_management` mandatory in 5.5. This
option will be always unused and assumed to be true.
Additionally, we make `override_decommission` deprecated, as this option
has been supported only with `consistent_cluster_management=false`.
Making `consistent_cluster_management` mandatory also simplifies
the code. Branches that execute only with
`consistent_cluster_management` disabled are removed.
We also update documentation by removing information irrelevant in 5.5.
Fixesscylladb/scylladb#15854
Note about upgrades: this PR does not introduce any more limitations
to the upgrade procedure than there are already. As in
scylladb/scylladb#16254, we can upgrade from the first version of Scylla
that supports the schema commitlog feature, i.e. from 5.1 (or
corresponding Enterprise release) or later. Assuming this PR ends up in
5.5, the documented upgrade support is from 5.4. For corresponding
Enterprise release, it's from 2023.x (based on 5.2), so all requirements
are met.
Closesscylladb/scylladb#16334
* github.com:scylladb/scylladb:
docs: update after making consistent_cluster_management mandatory
system_keyspace, main, cql_test_env: fix indendations
db: config: make consistent_cluster_management mandatory
test: boost: schema_change_test: replace disable_raft_schema_config
db: config: make override_decommission deprecated
db: config: make force_schema_commit_log deprecated
The test case that validates upload-sink works does this by getting several random ranges from the uploaded object and checks that the content is what it should be. The range boundaries are generated like this:
```
uint64_t len = random(1, chunk_size);
uint64_t offset = random(file_size) - len;
```
The 2nd line is not correct, if random number happens less than the len the offset befomes "negative", i.e. -- very large 64-bit unsigned value.
Next, this offset:len gets into s3 client's get_object_contiguous() helper which in turn converts them into http range header's bytes-specifier format which is "first_bytet-last_byte" one. The math here is
```
first_byte = offset;
last_byte = offset + len - 1;
```
Here the overflow of the offset thing results in underflow of the last_byte -- it becomes less than the first_byte. According to RFC this range-specifier is invalid and (!) can be ignored by the server. This is what minio does -- it ignores invalid range and returns back full object.
But that's not all. When returning object portion the http request status code is PartialContent, but when the range is ignored and full object is returned, the status is OK. This makes s3 client's request fail with unexpected_status_error in the middle of the test. Then the object is removed with deferred action and actual error is printed into logs. In the end of the day logs look as if deletion of an object failed with OK status %)
fixes: #16133Closesscylladb/scylladb#16324
* github.com:scylladb/scylladb:
test/s3: Avoid object range overflow
s3/client: Handle GET-with-Range overflows correctly
This introduces the ability to split a storage group.
The main compaction group is split into left and right groups.
set_split() is used to set the storage group to splitting mode, which
will create left and right compaction groups. Incoming writes will
now be placed into memtable of either left or right groups.
split() is used to complete the splitting of a group. It only
returns when all preexisting data is split. That means main
compaction group will be empty and all the data will be stored
in either left or right group.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The task for splitting compaction will run until all sstables
in the main set are split. The only exceptions are shutdown
or user has explicitly asked for abort.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Token group is an abstraction that allows us to easily segregate a
mutation stream into buckets. Groups share the same properties as
compaction groups. Groups follow the ring order and they don't
overlap each other. Groups are defined according to a classifier,
which return an id given a token. It's expected that classifier
return ids in monotonic increasing order.
The reasons for this abstraction are:
1) we don't want to make segregator aware of compaction groups
2) splitting happens before tablet metadata is changed, so the
the segregator will have to classify based on whether the token
belongs to left (group id 0) or right (group id 1) side of
the range to be split.
The reason for not extending sstable writer instead, is that
today, writer consumer can only tell producer to switch to a
new writer, when consuming the end of a partition, but that
would be too late for us, as we have to decide to move to
a new writer at partition start instead.
It will be wired into compaction when it happens in split mode.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Reduce code duplication by defining each metric just once, instead of three times, by having the semaphore register metrics by itself. This also makes the lifecycle of metrics contained in that of the semaphore. This is important on enterprise where semaphores are added and removed, together with service levels.
We don't want all semaphores to export metrics, so a new parameter is introduced and all call-sites make a call whether they opt-in or not.
Fixes: https://github.com/scylladb/scylladb/issues/16402Closesscylladb/scylladb#16383
* github.com:scylladb/scylladb:
database, reader_concurrency_sempaphore: deduplicate reader_concurrency_sempaphore metrics
reader_concurrency_semaphore: add register_metrics constructor parameter
sstables: name sstables_manager
Code that executed only when consistent_cluster_management=false is
removed. In particular, after this patch:
- raft_group0 and raft_group_registry are always enabled,
- raft_group0::status_for_monitoring::disabled becomes unused,
- topology tests can only run with consistent_cluster_management.
In the following commits, we make consistent cluster management
mandatory. This will make disable_raft_schema_config unusable,
so we need to get rid of it. However, we don't want to remove
tests that use it.
The idea is to use the Raft RECOVERY mode instead of disabling
consistent cluster management directly.
In this PR we refactor `token_metadata` to use `locator::host_id` instead of `gms::inet_address` for node identification in its internal data structures. Main motivation for these changes is to make raft state machine deterministic. The use of IPs is a problem since they are distributed through gossiper and can't be used reliably. One specific scenario is outlined [in this comment](https://github.com/scylladb/scylladb/pull/13655#issuecomment-1521389804) - `storage_service::topology_state_load` can't resolve host_id to IP when we are applying old raft log entries, containing host_id-s of the long-gone nodes.
The refactoring is structured as follows:
* Turn `token_metadata` into a template so that it can be used with host_id or inet_address as the node key. The version with inet_address (the current one) provides a `get_new()` method, which can be used to access the new version.
* Go over all places which write to the old version and make the corresponding writes to the new version through `get_new()`. When this stage is finished we can use any version of the `token_metadata` for reading.
* Go over all the places which read `token_metadata` and switch them to the new version.
* Make `host_id`-based `token_metadata` default, drop `inet_address`-based version, change `token_metadata` back to non-template.
These series [depends](1745a1551a) on RPC sender `host_id` being present in RPC `clent_info` for `bootstrap` and `replace` node_ops commands. This feature was added in [this commit](95c726a8df) and released in `5.4`. It is generally recommended not to skip versions when upgrading, so users who upgrade sequentially first to `5.4` (or the corresponding Enterprise version) then to the version with these changes (`5.5` or `6.0`) should be fine. If for some reason they upgrade from a version without `host_id` in RPC `clent_info` to the version with these changes and they run bootstrap or replace commands during the upgrade procedure itself, these commands may fail with an error `Coordinator host_id not found` if some nodes are already upgraded and the node which started the node_ops command is not yet upgraded. In this case the user can finish the upgrade first to version 5.4 or later, or start bootstrap/replace with an upgraded node. Note that removenode and decommission do not depend on coordinator host_id so they can be started in the middle of upgrade from any node.
Closesscylladb/scylladb#15903
* github.com:scylladb/scylladb:
topology: remove_endpoint: remove inet_address overload
token_metadata: topology: cleanup add_or_update_endpoint
token_metadata: add_replacing_endpoint: forbid replacing node with itself
topology: drop key_kind, host_id is now the primary key
dc_rack_fn: make it non-template
token_metadata: drop the template
shared_token_metadata: switch to the new token_metadata
gossiper: use new token_metadata
database: get_token_metadata -> new token_metadata
erm: switch to the new token_metadata
storage_service: get_token_metadata -> token_metadata2
storage_service: get_token_to_endpoint_map: use new token_metadata
api/token_metadata: switch to new version
storage_service::on_change: switch to new token_metadata
cdc: switch to token_metadata2
calculate_natural_endpoints: fix indentation
calculate_natural_endpoints: switch to token_metadata2
storage_service: get_changed_ranges_for_leaving: use new token_metadata
decommission_with_repair, removenode_with_repair -> new token_metadata
rebuild_with_repair, replace_with_repair: use new token_metadata
bootstrap: use new token_metadata
tablets: switch to token_metadata2
calculate_effective_replication_map: use new token_metadata
calculate_natural_endpoints: fix formatting
abstract_replication_strategy: calculate_natural_endpoints: make it work with both versions of token_metadata
network_topology_strategy_test: update new token_metadata
storage_service: on_alive: update new token_metadata
storage_service: handle_state_bootstrap: update new token_metadata
storage_service: snitch_reconfigured: update new token_metadata
storage_service: leave_ring: update new token_metadata
storage_service: node_ops_cmd_handler: update new token_metadata
storage_service: node_ops_cmd_handler: add coordinator_host_id
storage_service: bootstrap: update new token_metadata
storage_service: join_token_ring: update new token_metadata
storage_service: excise: update new token_metadata
storage_service: join_cluster: update new token_metadata
storage_service: on_remove: update new token_metadata
storage_service: handle_state_normal: fill new token_metadata
storage_service: topology_state_load: fill new token_metadata
storage_service: adjust update_topology_change_info to update new token_metadata
topology: set self host_id on the new topology
locator::topology: allow being_replaced and replacing nodes to have the same IP
token_metadata: get_endpoint_for_host_id -> get_endpoint_for_host_id_if_known
token_metadata: get_host_id: exception -> on_internal_error
token_metadata: add get_all_ips method
token_metadata: support host_id-based version
token_metadata: make it a template with NodeId=inet_address/host_id NodeId is used in all internal token_metadata data structures, that previously used inet_address. We choose topology::key_kind based on the value of the template parameter.
locator: make dc_rack_fn a template
locator/topology: add key_kind parameter
token_metadata: topology_change_info: change field types to token_metadata_ptr
token_metadata: drop unused method get_endpoint_to_token_map_for_reading
To be used in the next patch to control whether the semaphore registers
and exports metrics or not. We want to move metric registration to the
semaphore but we don't want all semaphores to export metrics. The
decision on whether a semaphore should or shouldn't export metrics
should be made on a case-by-case basis so this new parameter has no
default value (except for the for_tests constructor).
Make host_id parameter non-optional and
move it to the beginning of the arguments list.
Delete unused overloads of add_or_update_endpoint.
Delete unused overload of token_metadata::update_topology
with inet_address argument.
In this commit we replace token_metadata with token_metadata2
in the erm interface and field types. To accommodate the change
some of strategy-related methods are also updated.
All the boost and topology tests pass with this change.
locator_topology_test, network_topology_strategy_test and
tablets_test are fully switched to the host_id-based token_metadata,
meaning they no longer populate the old token_metadata.
All the boost and topology tests pass with this change.
In this commit we switch the function
calculate_effective_replication_map to use the new
token_metadata. We do this by employing our new helper
calculate_natural_ips function. We can't use this helper for
current_endpoints/target_endpoints though,
since in that case we won't add the IP to the
pending_endpoints in the replace-with-same-ip scenario
The token_metadata_test is migrated to host_ids in the same
commit to make it pass. Other tests work because they fill
both versions of the token_metadata, but for this test it was
simpler to just migrate it straight away. The test constructs
the old token_metadata over the new token_metadata,
this means only the get_new() method will work on it. That's
why we also need to switch some other functions
(maybe_remove_node_being_replaced, do_get_natural_endpoints,
get_replication_factor) to the new version in the same commit.
All the boost and topology tests pass with this change.
We've updated all the places where token_metadata
is mutated, and now we can progress to the next stage
of the refactoring - gradually switching the read
code paths.
The calculate_natural_endpoints function
is at the core of all of them. It decides to what nodes
the given token should be replicated to for the given
token_metadata. It has a lot of usages in various contexts,
we can't switch them all in one commit, so instead we
allowed the function to behave in both ways. If
use_host_id parameter is false, the function uses the provided
token_metadata as is and returns endpoint_set as a result.
If it's true, it uses get_new() on the provided token_metadata
and returns host_id_set as a result.
The scope of the whole refactoring is limited to the erm data
structure, its interface will be kept inet_address based for now.
This means we'll often need to resolve host_ids to inet_address-es
as soon as we got a result from calculated_natural_endpoints.
A new calculate_natural_ips function is added for convenience.
It uses the new token_metadata and immediately resolves
returned host_id-s to inet_address-es.
The auxiliary declarations natural_ep_type, set_type, vector_type,
get_self_id, select_tm are introduced only for the sake of
migration, they will be removed later.
Fixes#16312
This test replays a segment before it might be closed or even fully flushed,
thus it can (with the new semantics) generate a segment_truncation exception
if hitting eof earlier than expected. (Note: test does not use pre-allocated
segments).
When performing a schema change through group 0, extend the schema mutations with a version that's persisted and then used by the nodes in the cluster in place of the old schema digest, which becomes horribly slow as we perform more and more schema changes (#7620).
If the change is a table create or alter, also extend the mutations with a version for this table to be used for `schema::version()`s instead of having each node calculate a hash which is susceptible to bugs (#13957).
When performing a schema change in Raft RECOVERY mode we also extend schema mutations which forces nodes to revert to the old way of calculating schema versions when necessary.
We can only introduce these extensions if all of the cluster understands them, so protect this code by a new cluster/schema feature, `GROUP0_SCHEMA_VERSIONING`.
Fixes: #7620Fixes: #13957
---
This is a reincarnation of PR scylladb/scylladb#15331. The previous PR was reverted due to a bug it unmasked; the bug has now been fixed (scylladb/scylladb#16139). Some refactors from the previous PR were already merged separately, so this one is a bit smaller.
I have checked with @Lorak-mmk's reproducer (https://github.com/Lorak-mmk/udt_schema_change_reproducer -- many thanks for it!) that the originally exposed bug is no longer reproducing on this PR, and that it can still be reproduced if I revert the aforementioned fix on top of this PR.
Closesscylladb/scylladb#16242
* github.com:scylladb/scylladb:
docs: describe group 0 schema versioning in raft docs
test: add test for group 0 schema versioning
feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode
schema_tables: don't delete `version` cell from `scylla_tables` mutations from group 0
migration_manager: add `committed_by_group0` flag to `system.scylla_tables` mutations
schema_tables: use schema version from group 0 if present
migration_manager: store `group0_schema_version` in `scylla_local` during schema changes
system_keyspace: make `get/set_scylla_local_param` public
feature_service: add `GROUP0_SCHEMA_VERSIONING` feature
This commit fixes an inconsistency in method names:
get_host_id and get_host_id_if_known are
(internal_error, returns null), but there was only
one method for the opposite conversion - get_endpoint_for_host_id,
and it returns null. In this commit we change it to on_internal_error
if it can't find the argument and add another method
get_endpoint_for_host_id_if_known which returns null in this case.
We can't use get_endpoint_for_host_id/get_host_id
in host_id_or_endpoint::resolve since it's called
from storage_service::parse_node_list
-> token_metadata::parse_host_id_and_endpoint,
and exceptions are caught and handled in
`storage_service::parse_node_list`.
In the next commits token_metadata will be
made a template with NodeId=inet_address|host_id
parameter. This parameter will be passed to dc_rack_fn
function, so it also should be made a template.