before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
* column_definition
* column_mapping
* ordinal_column_id
* raw_view_info
* schema
* view_ptr
their operator<<:s are dropped. but operator<< for schema is preserved,
as we are still printing `seastar::lw_shared_ptr<const schema>` with
our homebrew generic formatter for `seastar::lw_shared_ptr<>`, which
uses operator<< to print the pointee.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17768
codespell reports "Nees" should be "Needs" but "Nees" is the last
name of Georg Nees. so it is not a misspelling, and should not be
fixed.
the purpose of lolwut.cc is to display the Redis version and
print a piece of generative computer art. the one included in our version
was created by Georg Nees. since the LOLWUT command does not contain
business logic connected with scylladb, we don't lose much if we skip
it when scanning for spelling errors. so, in this change, let's
skip it. this should silence one more warning from the github
codespell workflow.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17770
downloads.scylladb.com recently started redirecting from http to https
(via `301 Moved Permanently`).
This broke package downloading in open-coredump.sh.
To fix this, we have to instruct curl to follow redirects.
Closes scylladb/scylladb#17759
This patch adds the dc option support for table repair. The management
tool can use this option to select nodes in specific data centers to run
repair.
Fixes: #17550
Tests: repair_additional_test.py::TestRepairAdditional::test_repair_option_dc
Closes scylladb/scylladb#17571
Calling scylla-nodetool with the describering option and omitting the keyspace
name argument results in a boost exception with the following error message:
error running operation: boost::wrapexcept<boost::bad_any_cast> (boost::bad_any_cast: failed conversion using boost::any_cast)
This change checks for the missing keyspace and outputs a more sensible
error message:
error processing arguments: keyspace must be specified
Closes scylladb/scylladb#17741
Just a cleanup -- replace do_with_cql_env + async with do_with_cql_env_thread
Closes scylladb/scylladb#17758
* github.com:scylladb/scylladb:
test/storage_proxy: Restore indentation after previous patch
test/storage_proxy: Use do_with_cql_env_thread()
One of the test cases explicitly wraps itself into async, but there's a
convenience helper for that already.
Indentation is deliberately left broken
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The builder works in "steps". Each step runs for a given base table; when a
new view is created, it either initiates a step or joins the currently
running one.
Running a step means reading mutations from a local sstables reader and
applying them to all views that have joined this step so far. When a
view is added to the step, it remembers the current token value the step
is on. When the step receives end-of-stream, it rewinds to the minimal token.
Rewinding is done by closing the current reader and creating a new one. Each
time the token is advanced, all the views that meet the new token value for
the second time (i.e. after a full scan round) are marked as built and are
removed from the step. When no views are left on the step, it finishes.
The above machinery can break when rewinding the end-of-stream reader.
The trick is that a running step silently assumes that if the reader
once produced some token (and there can be a view that remembered this
token as its starting one), then after rewinding the reader would
generate the same token or greater. With tablets, however, that's not
the case. When a node is decommissioned, tablets are cleaned up and all
sstables are removed. Rewinding a reader after that creates an empty reader
that produces no tokens from then on. Consequently, any build steps that
had captured tokens prior to cleanup would get stuck forever.
The fix is to check whether the mutation consumer stepped at least one step
forward after the rewind, and if not -- complete all the attached views.
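The stuck-step scenario and the fix can be modelled with a minimal Python sketch. This is not the view builder code: `finish_after_rewind`, `make_reader`, `pending` and `complete_on_empty` are hypothetical names, tokens are plain integers, and each call to `make_reader` stands for one rewind.

```python
def finish_after_rewind(make_reader, pending, complete_on_empty=True, max_rewinds=3):
    """Drive a build step whose reader has just hit end-of-stream.

    make_reader: hypothetical factory; each call models one rewind and
                 returns an iterable of tokens in ascending order.
    pending:     dict of view name -> token the step was on when the view
                 joined it.
    Returns (built, still_pending).
    """
    pending = dict(pending)
    built = []
    for _ in range(max_rewinds):
        if not pending:
            break
        advanced = False
        for token in make_reader():
            advanced = True
            # A view that meets its starting token again has seen a full round.
            for name in [n for n, start in pending.items() if token >= start]:
                built.append(name)
                del pending[name]
        if not advanced and complete_on_empty:
            # The fix: the rewound reader produced nothing, so no view can
            # ever meet its token again. Complete all attached views.
            built.extend(sorted(pending))
            pending.clear()
    return built, pending


views = {"v1": 20, "v2": 5}
print(finish_after_rewind(lambda: [10, 20, 30], views))   # sstables still present
print(finish_after_rewind(lambda: [], views))             # sstables removed, with the fix
print(finish_after_rewind(lambda: [], views, complete_on_empty=False))  # without the fix
```

With `complete_on_empty=False` the views stay pending no matter how many times the reader is rewound, which is the hang described above.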
fixes: #17293
A similar thing should happen if the base table is truncated while views
are being built from it. Testing it trips a compaction assertion elsewhere
and needs more research.
refs: #17543
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes scylladb/scylladb#17548
summarize_tests() is only used to summarize boost tests, so reflect
this fact using its name. we will need to summarize the tests which
generate JUnit XML as well, so this change also prepares for a
follow-up change to implement a new summarize helper.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17746
There are three endpoints in the api/cache_service that report "metrics"
for the row cache. The values they currently return are:
- entries: number of partitions
- size: number of partitions
- capacity: used space
The size and capacity seem very inaccurate.
A comment says that in C* the size should be weighted, but Scylla doesn't
support weights of entries in the cache. Also, capacity is configurable via
the row_cache_size_in_mb config option or the set_row_cache_capacity_in_mb API
call, but Scylla doesn't support either of those.
This patch suggests changing the return values of the size and capacity endpoints.
Although the row cache doesn't support weights, it's natural to return
used_space in bytes as the value, which is closer to what "size"
means than the number of entries.
The capacity can return the total memory size, because this is what
Scylla really does -- row cache growth is only limited by other memory
consumers, not by configured limits.
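The before/after mapping can be summarized with a small Python sketch. The `RowCacheStats` class and both functions are hypothetical and only restate the values listed above; they are not the actual API handlers.

```python
from dataclasses import dataclass


@dataclass
class RowCacheStats:
    # Hypothetical snapshot of the numbers the endpoints draw from.
    partitions: int
    used_space_bytes: int
    total_memory_bytes: int


def cache_metrics_before(stats):
    # Old behaviour: "size" repeats the entry count, "capacity" is used space.
    return {
        "entries": stats.partitions,
        "size": stats.partitions,
        "capacity": stats.used_space_bytes,
    }


def cache_metrics_after(stats):
    # Proposed behaviour: "size" is used space in bytes, "capacity" is total memory.
    return {
        "entries": stats.partitions,
        "size": stats.used_space_bytes,
        "capacity": stats.total_memory_bytes,
    }


stats = RowCacheStats(partitions=1000, used_space_bytes=4 << 20, total_memory_bytes=8 << 30)
print(cache_metrics_before(stats))
print(cache_metrics_after(stats))
```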
fixes: #9418
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes scylladb/scylladb#17724
The test carries const std::string_view& around, but the type is a
lightweight class that can be copied at the same cost as a reference
to it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes scylladb/scylladb#17735
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define a formatter for `view_info` and drop its
operator<<.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17745
before this change, we rely on the default-generated fmt::formatter created
from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define formatters for
* utils::human_readable_value
* std::strong_ordering
* std::weak_ordering
* std::partial_ordering
* utils::exception_container
Refs https://github.com/scylladb/scylladb/issues/13245
Closes scylladb/scylladb#17710
* github.com:scylladb/scylladb:
utils/exception_container: add fmt::formatter for exception_container
utils/human_readable: add fmt::formatter for human_readable_value
utils: add fmt::formatter for std::strong_ordering and friends
This PR addresses the comments left on #17481, namely
- adds case selection to boost suite
- describes the case selection in documentation
Closes scylladb/scylladb#17721
* github.com:scylladb/scylladb:
docs: Add info about the ability to run specific test case
test.py: Support case selection for boost tests
the corresponding implementation of operator<< was dropped in
a40d3fc25b, so there is no need to
keep this friend declaration anymore.
also, drop `include <ostream>`, as this header does not reference
any of the ostream types after the change above.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17743
* seastar 5d3ee980...a71bd96d (51):
> util: add formatter for optimized_optional<>
> build: search protobuf using package config
> reactor: Move pieces of scollectd to scollectd
> reactor: Remove write-only task_queue._current
> Add missing include in tests/unit/rpc_test.cc
> doc/io_tester.md: include request_type::unlink in the docs
> doc/io-tester.md: update obsolete information in io_tester docs
> io_tester/conf.yaml: include an example of request_type::unlink job
> io_tester: implement request_type::unlink
> reactor: Print correct errno on io_submit failure
> src/core/reactor.cc: qualify metric function calls with "sm::"
> build: add shard_id.hh to seastar library
> thread: speed up thread creation in debug mode
> include: add missing modules.hh import to shard_id.hh
> prometheus: avoid ambiguity when calling MetricFamily.set_name()
> util/log: add formatter for log_level
> util/log: use string_view for log_level_names
> perf: Calculate length of name column in perf tests
> rpc_test: add a test for inter-compressor communication
> rpc: in multi_algo_compressor_factory, propagate send_empty_frame
> rpc: give compressors a way to send something over the connection
> rpc: allow (and skip) empty compressed frames
> metrics: change value_vector type to std::deque
> HACKING.md: remove doc related to test_dist
> test/unit: do not check if __cplusplus > 201703L
> json_elements: s/foramted/formatted/
> iostream: Refactor input_stream::read_exactly_part
> add unit test to verify str.starts_with(str), str.ends_with(str) return true.
> str.starts_with(str) and str.ends_with(str) should return true, just like std::string
> rpc: Remove FrameType::header_and_buffer_type
> rpc: Defuturize FrameType::return_type
> rpc: Kill FrameType::get_size()
> treewide: put std::invocable<> constraints in template param list
> include: do not include unuser headers
> rpc: fix a deadlock in connection::send()
> iostream: Replace recursion by iteration in input_stream::read_exactly_part
> core/bitops.hh: use std::integral when appropriate
> treewide: include <concepts> instead of seastar/util/concepts.hh
> abortable_fifo: fix the indent
> treewide: expand `SEASTAR_CONCEPT` macro
> util/concepts: always define SEASTAR_CONCEPT
> file: Remove unused thread-pool arg from directory lister
> seastar-json2code: collect required_query_params using a list
> seastar-json2code: reduce the indent level
> seastar-json2code: indent the enum and array elements
> seastar-json2code: generate code for enum type using Template
> seastar-json2code: extract add_operation() out
> reactor: Re-ifdef SIGSEGV sigaction installing
> reactor: Re-ifdef reactor::enable_timer()
> reactor: Re-ifdef task_histogram_add_task()
> reactor: Re-ifdef install_signal_handler_stack()
Closes scylladb/scylladb#17714
This small series improves the Alternator tests for metrics:
1. Improves some comments in the test.
2. Restores a test that was previously hidden by two tests having the same name.
3. Adds tests for latency histogram metrics.
Closes scylladb/scylladb#17623
* github.com:scylladb/scylladb:
test/alternator: tests for latency metrics
test/alternator: improve comments and unhide hidden test
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for `exception_container<..>`
and drop its operator<<.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter created
from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define formatters for `utils::human_readable_value`,
and drop its operator<<
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter created
from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define formatters for
* std::strong_ordering
* std::weak_ordering
* std::partial_ordering
and their operator<<:s are moved to test/lib/test_utils.{hh,cc}, as they
are only used by Boost.test.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
There are four stages left to handle: cleanup, cleanup_target, end_migration and revert_migration. All are handling removed nodes already, so the PR just extends the test.
fixes: #16527
Closes scylladb/scylladb#17684
* github.com:scylladb/scylladb:
test/tablets_migration: Test revert_migration failure handling
test/tablets_migration: Test end_migration failure handling
test/tablets_migration: Test cleanup_target failure handling
test/tablets_migration: Test cleanup failure handling
test/tablets_migration: Prepare for do_... stages
test/tablets_migration: Add ability to removenode via any other node
test/tablets_migration: Wrap migration stages failing code into a helper class
storage_service: Add failure injection to crash cleanup_tablet
Instead of a functor, for those metrics that just return the value of an
existing member variable. This is ever so slightly more efficient than a
functor.
Closes scylladb/scylladb#17726
In test/alternator/test_metrics.py we had tests for the operation-count
metrics for different Alternator API operations, but not for the latency
histograms for these same operations. So this patch adds the missing
tests (and removes a TODO asking to do that).
Note that only a subset of the operations - PutItem, GetItem, DeleteItem,
UpdateItem, and GetRecords - currently have a latency histogram, and this
test verifies this. We have an issue (Refs #17616) about adding latency
histograms for more operations - at which point we will be able to expand
this test for the additional operations.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The original goal of this patch was to improve comments in
test/alternator/test_metrics.py, but while doing that I discovered
that one of the test functions was hidden by a second test with
the same name! So this patch also renames the second test.
The test continues to work after this patch - the hidden test
was successful.
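For context, a minimal Python sketch of how a test gets hidden this way. The function name `test_metric` is hypothetical and not taken from test_metrics.py. A second `def` with the same name rebinds the module-level name, so anything that collects tests by looking up `test_*` names only ever sees the last definition.

```python
# Model a test module as source text so the effect is easy to inspect.
source = '''
def test_metric():
    return "first definition"

def test_metric():  # same name: silently replaces the function above
    return "second definition"
'''

namespace = {}
exec(source, namespace)

# Only one test_* name survives, and it is bound to the second function.
collected = [name for name in namespace if name.startswith("test_")]
print(collected)
print(namespace["test_metric"]())
```

Renaming the second function gives the module two distinct names again, so both tests are collected and run.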
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Before this change, when a user tried to use the
'storage_service/ownership/{keyspace}' API with a
keyspace parameter that uses tablets, an internal
error was thrown. The code was calling a function
that is intended for vnodes: get_vnode_effective_replication_map().
This commit introduces graceful handling of this scenario and
extends the API to allow passing a 'cf' parameter that denotes
the table name.
Now, when the keyspace uses tablets and the cf parameter is not passed,
a descriptive error message is returned via BAD_REQUEST.
Users cannot query ownership for a keyspace that uses tablets,
but they can query ownership for a table in a given keyspace that uses tablets.
Also, new tests have been added to test/rest_api/test_storage_service.py and
to test/topology_experimental_raft/test_tablets.py in order to verify the behavior
with and without tablets enabled.
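The resulting request handling can be sketched in Python as follows. This is an illustration only: `BadRequest`, `resolve_ownership_scope` and the error text are hypothetical, and ignoring `cf` for vnode keyspaces is an assumption of the sketch, not something stated above.

```python
class BadRequest(Exception):
    """Stands in for an HTTP BAD_REQUEST response (hypothetical)."""


def resolve_ownership_scope(keyspace_uses_tablets, cf=None):
    """Decide what ownership is computed for, or reject the request."""
    if not keyspace_uses_tablets:
        # vnodes: ownership is computed for the whole keyspace.
        # Assumption of this sketch: 'cf' is simply ignored here.
        return "keyspace"
    if cf is None:
        # Tablets are per table, so a keyspace-wide answer is not available.
        raise BadRequest("keyspace uses tablets, pass the 'cf' parameter")
    return "table"


print(resolve_ownership_scope(False))
print(resolve_ownership_scope(True, cf="t1"))
try:
    resolve_ownership_scope(True)
except BadRequest as e:
    print("rejected:", e)
```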
Fixes: https://github.com/scylladb/scylladb/issues/17342
Closes scylladb/scylladb#17405
* github.com:scylladb/scylladb:
storage_service/ownership: discard get_ownership() requests when tablets enabled
storage_service/ownership/{keyspace}: handle requests when tablets are enabled
locator/effective_replication_map: make 'get_ranges(inet_address ep)' virtual
locator/tablets: add tablet_map::get_sorted_tokens()
pylib/rest_client.py: add ownership API to ScyllaRESTAPIClient
rest_api/test_storage_service: add simplistic tests of ownership API for vnodes
Seastar removed `task_queue::_current` in
258b11220d343d8c7ae1a2ab056fb5e202723cc8. let's adapt scylla-gdb.py
accordingly. despite that `current_scheduling_group_ptr()` is an internal
API, it's been around for a while, and relatively stable. so let's use
it instead.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17720
The short series allows do_status_check to handle down nodes that don't have HOST_ID application state.
Fixes #16936
Closes scylladb/scylladb#17024
* github.com:scylladb/scylladb:
gossiper: do_status_check: fixup indentation
gossiper: do_status_check: allow evicting dead nodes from membership with no host_id
gossiper: print the host_id when endpoint state goes UP/DOWN
gossiper: get_host_id: differentiate between no endpoint_state and no application_state
gms: endpoint_state: add get_host_id
gossiper: do_status_check: continue loop after evicting FatClient
before this change, we rely on the default-generated fmt::formatter created
from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define formatters for internal types in service/storage_proxy.cc.
please note, `service::storage_proxy::remote::read_verb` is extracted out of
the outer class because the class's implementation formats `read_verb` within
the class itself, so we have to put the formatter where its callers can see it.
that's why it is moved up and out of `service::storage_proxy::remote`.
some of the operator<<:s are preserved, as they are still being used by
the existing formatters, for instance, the one for
`seastar::shared_ptr<>`, which is used to print
`seastar::shared_ptr<service::paxos_response_handler>`.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17708
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for `bound_kind` and `bound_view`,
and drop the latter's operator<<.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17706
Shard-level latencies generate a lot of metrics. This patch reduces
the number of latencies reported by Alternator while keeping the same
functionality.
On the shard level, summaries will be reported instead of histograms.
On the instance level, an aggregated histogram will be reported.
Summaries, histograms, and counters are marked with skip_when_empty.
Fixes #12230
Closes scylladb/scylladb#17581
This change introduces logic that checks whether
tablets are enabled for any of the
keyspaces when get_ownership() is invoked.
Without it, the result would be calculated
based solely on sorted_tokens(), which was
invalid.
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Before this change, when a user tried to use the
'storage_service/ownership/{keyspace}' API with a
keyspace parameter that uses tablets, an internal
error was thrown. The code was calling a function
that is intended for vnodes: get_vnode_effective_replication_map().
This commit introduces graceful handling of this scenario and
extends the API to allow passing a 'cf' parameter that denotes
the table name.
Now, when the keyspace uses tablets and the cf parameter is not passed,
a descriptive error message is returned via BAD_REQUEST.
Users cannot query ownership for a keyspace that uses tablets,
but they can query ownership for a table in a given keyspace that uses tablets.
Also, new tests have been added to test/rest_api/test_storage_service.py and
to test/topology_experimental_raft/test_tablets.py in order to verify the behavior
with and without tablets enabled.
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Before this patch, the mentioned function was a specific
member of vnode_effective_replication_strategy class.
To allow its usage also when tablets are enabled it was
shifted to the base class - effective_replication_strategy
and made pure virtual to force the derived classes to
implement it.
It is used by 'storage_service::get_ranges_for_endpoint()'
that is used in calculation of effective ownership. Such
calculation needs to be performed also when tablets are
enabled.
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change introduces a new member function that
returns a vector of sorted tokens where each pair of adjacent
elements depicts a range of tokens that belong to a tablet.
It will be used to produce the equivalent of sorted_tokens() of
vnodes when trying to use dht::describe_ownership() for tablets.
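How a consumer can read such a vector is shown in the Python sketch below. It only illustrates the "adjacent pairs are ranges" convention described above; `ranges_from_sorted_tokens` is a hypothetical helper, the tokens are made-up small integers, and whether each bound is inclusive or exclusive is left out on purpose.

```python
def ranges_from_sorted_tokens(tokens):
    """Pair every token with its successor: [t0, t1, t2] -> [(t0, t1), (t1, t2)]."""
    if list(tokens) != sorted(tokens):
        raise ValueError("tokens must be sorted")
    return list(zip(tokens, tokens[1:]))


# Three boundaries describe two tablet ranges.
print(ranges_from_sorted_tokens([-100, 0, 100]))
```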
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change adds a member function that can be used
to access 'storage_service/ownership' API.
It will be used by tests that need to access this API.
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
This change is intended to introduce tests for vnodes for
the following API paths:
- 'storage_service/ownership'
- 'storage_service/ownership/{keyspace}'
In next patches the logic that is tested will be adjusted
to work correctly when tablets are enabled. This is a safety
net that ensures that the logic is not broken.
Refs: scylladb#17342
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
before this change, we rely on the default-generated fmt::formatter created
from operator<<, but fmt v10 dropped the default-generated formatter.
in this change, we define formatters for
* reader_permit::state
* reader_resources
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17707
instead of using fmt::runtime format string, use compile-time
format string, so that we can have compile-time format check provided
by {fmt}.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes scylladb/scylladb#17709
The test.py usage is documented, and the ability to run a specific test by
its name is described in the docs. Extend them with the new ability to run
a specific test case as well.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Boost tests support case-by-case execution and always turn it on -- when
run, boost test is split into parallel-running sub-tests each with the
specific case name.
This patch tunes this, so that when a test is run like
test.py boost/testname::casename
No parallel-execution happens, but instead just the needed casename is
run. Example of selection:
test.py --mode=${mode} boost/bptree_test::test_cookie_find
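The selection rule can be sketched in Python as follows. This is not the test.py code: `split_test_name` and `cases_to_run` are hypothetical helpers, and the list of known cases is made up for the example.

```python
def split_test_name(name):
    """Split 'suite/test::case' into (suite, test, case); case is None if absent."""
    path, sep, case = name.partition("::")
    suite, _, test = path.partition("/")
    return suite, test, (case if sep else None)


def cases_to_run(name, known_cases):
    """Return the cases to execute for a test given on the command line."""
    _, _, case = split_test_name(name)
    if case is None:
        # No case selected: every case runs as its own parallel sub-test.
        return list(known_cases)
    # A case was selected: run just that one, no parallel split.
    return [case]


known = ["test_cookie_find", "test_insert"]  # made-up case list
print(split_test_name("boost/bptree_test::test_cookie_find"))
print(cases_to_run("boost/bptree_test::test_cookie_find", known))
print(cases_to_run("boost/bptree_test", known))
```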
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This stage is also the error path that starts from write_both_read_old,
so check this failure in two steps -- first fail the latter stage in one
of the nodes, then fail the former in another.
For that one more node in the cluster is needed.
Also, to avoid name conflicts, the do_revert_migration pseudo stage name
is used.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This stage is a pure barrier. Barriers already take ignored nodes into
account, and so does the fail-injector, so just wire the stage name into the
test.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This stage is an error path, so in order to fail it we need to fail some
other stage prior to that. This leads to the testing sequence of
1. fail streaming via source node
2. stop and remove source node to let state machine proceed
3. fail cleanup_target on the destination node
4. stop and remove destination node
The first thing to note here is that the test doesn't fail the source node for
the cleanup_target stage, symmetrically to how it does for the cleanup stage.
Next, since we're removing two nodes, the cluster is equipped with more
nodes to have a raft quorum.
Finally, since the removal of the source node doesn't finish until tablet
migration finishes, it's impossible to remove the destination node via the
same node-0, so the 2nd removenode happens via node-3.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The handling itself is already there -- if the leaving node is excluded,
the cleanup stage resolves immediately. So just add code that
validates that.
Also, skip testing of pending replica failure during the cleanup stage, as
it doesn't really participate in it any longer.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>