Return 400 Bad Request instead of 500 Internal Server Error
when user requests task or module which does not exist through
task manager and task manager test apis.
Closes#14166
* github.com:scylladb/scylladb:
test: add test checking response status when requested module does not exist
api: fix indentation
api: throw bad_param_exception when requested task/module does not exists
The call is generic enough not to drop the sstable itself on return so
that callers can do whatever they need with it. The only today's caller
is API which will convert sstables to filenames on its own
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In task manager and task manager test rest apis when a task or module
which does not exist is requested, we get Internal Server Error.
In such cases, wrap thrown exceptions in bad_param_exception
to respond with Bad Request code.
Modify test accordingly.
Task manager compaction tasks need table names for logs.
Thus, compaction tasks store table infos instead of table ids.
get_table_ids function is deleted as it isn't used anywhere.
Some assorted cleanups here: consolidation of schema agreement waiting
into a single place and removing unused code from the gossiper.
CI: https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/1458/
Reviewed-by: Konstantin Osipov <kostja@scylladb.com>
* gleb/gossiper-cleanups of github.com:scylladb/scylla-dev:
storage_service: avoid unneeded copies in on_change
storage_service: remove check that is always true
storage_service: rename handle_state_removing to handle_state_removed
storage_service: avoid string copy
storage_service: delete code that handled REMOVING_TOKENS state
gossiper: remove code related to advertising REMOVING_TOKEN state
migration_manager: add wait_for_schema_agreement() function
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `api::table_info` without the help of `operator<<`.
but the corresponding `operator<<()` is preserved in this change, as we
still have lots of callers relying on this << operator instorage_service.cc
where std::vector<table_info> is formatted using operator<<(ostream&, const Range&)
defined in to_string.hh. we could have used fmt/ranges.h to print the
std::vector<table_info>. but the combination of operator<<(ostream&, const Range&)
and FMT_DEPRECATED_OSTREAM renders this impossible. because
unlike the builtin range formatter specializations, the fallback formatter
synthesized from the operator<< does not have brackets defined for
the range printer. the brackets are used as the left and right marks
of the range, for instance, the array-alike containers are printed
like [1,2,3], while the tuple-alike containers are printed like
(1,2,3). once we are allowed to remove FMT_DEPRECATED_OSTREAM, we
should be able to use the builtin range formatter, and remove the
operator<< for api::table_info by then.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13975
`check_and_repair_cdc_streams` is an existing API which you can use when the
current CDC generation is suboptimal, e.g. after you decommissioned a node the
current generation has more stream IDs than you need. In that case you can do
`nodetool checkAndRepairCdcStreams` to create a new generation with fewer
streams.
It also works when you change number of shards on some node. We don't
automatically introduce a new generation in that case but you can use
`checkAndRepairCdcStreams` to create a new generation with restored
shard-colocation.
This PR implements the API on top of raft topology, it was originally
implemented using gossiper. It uses the `commit_cdc_generation` topology
transition state and a new `publish_cdc_generation` state to create new CDC
generations in a cluster without any nodes changing their `node_state`s in the
process.
Closes#13683
* github.com:scylladb/scylladb:
docs: update topology-over-raft.md
test: topology_experimental_raft: test `check_and_repair_cdc` API
raft topology: implement `check_and_repair_cdc_streams` API
raft topology: implement global request handling
raft topology: introduce `prepare_new_cdc_generation_data`
raft_topology: `get_node_to_work_on_opt`: return guard if no node found
raft topology: remove `node_to_work_on` from `commit_cdc_generation` transition
raft topology: separate `publish_cdc_generation` state
raft topology: non-node-specific `exec_global_command`
raft topology: introduce `start_operation()`
raft topology: non-node-specific `topology_mutation_builder`
topology_state_machine: introduce `global_topology_request`
topology_state_machine: use `uint16_t` for `enum_class`es
raft topology: make `new_cdc_generation_data_uuid` topology-global
in this change, the type of the "generation" field of "sstable" in the
return value of RESTful API entry point at
"/storage_service/sstable_info" is changed from "long" to "string".
this change depends on the corresponding change on tools/jmx submodule,
so we have to include the submodule change in this very commit.
this API is used by our JMX exporter, which in turn exposes the
SSTable information via the "StorageService.getSSTableInfo" mBean
operation, which returns the retrieved SSTable info as a list of
CompositeData. and "generation" is a field of an element in the
CompositeData. in general, the scylla JMX exporter is consumed
by the nodetool, which prints out returned SSTable info list with
a pretty formatted table, see
tools/java/src/java/org/apache/cassandra/tools/nodetool/SSTableInfo.java.
the nodetool's formatter is not aware of the schema or type of the
SSTables to be printed, neither does it enforce the type -- it just
tries it best to pretty print them as a tabular.
But the fields in CompositeData is typed, when the scylla JMX exporter
translates the returned SSTables from the RESTful API, it sets the
typed fields of every `SSTableInfo` when constructing `PerTableSSTableInfo`.
So, we should be consistent on the type of "generation" field on both
the JMX and the RESTful API sides. because we package the same version
of scylla-jmx and nodetool in the same precompiled tarball, and enforce
the dependencies on exactly same version when shipping deb and rpm
packages, we should be safe when it comes to interoperability of
scylla-jmx and scylla. also, as explained above, nodetool does not care
about the typing, so it is not a problem on nodetool's front.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13834
Task manager's tasks covering scrub compaction on top,
shard and table level.
For this levels we have common scrub tasks for each scrub
mode since they share code. Scrub modes will be differentiated
on compaction group level.
Closes#13694
* github.com:scylladb/scylladb:
test: extend test_compaction_task.py to test scrub compaction
compaction: add table_scrub_sstables_compaction_task_impl
compaction: add shard_scrub_sstables_compaction_task_impl
compaction: add scrub_sstables_compaction_task_impl
api: get rid of unnecessary std::optional in scrub
compaction: rename rewrite_sstables_compaction_task_impl
The original API is gossiper-based. Since we're moving CDC generations
handling to Raft-based topology, we need to implement this API as well.
For now the API creates a new generation unconditionally, in a follow-up
I'll introduce a check to skip the creation if the current generation is
optimal.
* replace generation_type::value() with generation_type::as_int()
* drop generation_value()
because we will switch over to UUID based generation identifier, the member
function or the free function generation_value() cannot fulfill the needs
anymore. so, in this change, they are consolidated and are replaced by
"as_int()", whose name is more specific, and will also work and won't be
misleading even after switching to UUID based generation identifier. as
`value()` would be confusing by then: it could be an integer or a UUID.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This PR introduces an experimental feature called "tablets". Tablets are
a way to distribute data in the cluster, which is an alternative to the
current vnode-based replication. Vnode-based replication strategy tries
to evenly distribute the global token space shared by all tables among
nodes and shards. With tablets, the aim is to start from a different
side. Divide resources of replica-shard into tablets, with a goal of
having a fixed target tablet size, and then assign those tablets to
serve fragments of tables (also called tablets). This will allow us to
balance the load in a more flexible manner, by moving individual tablets
around. Also, unlike with vnode ranges, tablet replicas live on a
particular shard on a given node, which will allow us to bind raft
groups to tablets. Those goals are not yet achieved with this PR, but it
lays the ground for this.
Things achieved in this PR:
- You can start a cluster and create a keyspace whose tables will use
tablet-based replication. This is done by setting `initial_tablets`
option:
```
CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy',
'replication_factor': 3,
'initial_tablets': 8};
```
All tables created in such a keyspace will be tablet-based.
Tablet-based replication is a trait, not a separate replication
strategy. Tablets don't change the spirit of replication strategy, it
just alters the way in which data ownership is managed. In theory, we
could use it for other strategies as well like
EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy
is augmented to support tablets.
- You can create and drop tablet-based tables (no DDL language changes)
- DML / DQL work with tablet-based tables
Replicas for tablet-based tables are chosen from tablet metadata
instead of token metadata
Things which are not yet implemented:
- handling of views, indexes, CDC created on tablet-based tables
- sharding is done using the old method, it ignores the shard allocated in tablet metadata
- node operations (topology changes, repair, rebuild) are not handling tablet-based tables
- not integrated with compaction groups
- tablet allocator piggy-backs on tokens to choose replicas.
Eventually we want to allocate based on current load, not statically
Closes#13387
* github.com:scylladb/scylladb:
test: topology: Introduce test_tablets.py
raft: Introduce 'raft_server_force_snapshot' error injection
locator: network_topology_strategy: Support tablet replication
service: Introduce tablet_allocator
locator: Introduce tablet_aware_replication_strategy
locator: Extract maybe_remove_node_being_replaced()
dht: token_metadata: Introduce get_my_id()
migration_manager: Send tablet metadata as part of schema pull
storage_service: Load tablet metadata when reloading topology state
storage_service: Load tablet metadata on boot and from group0 changes
db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata()
migration_notifier: Introduce before_drop_keyspace()
migration_manager: Make prepare_keyspace_drop_announcement() return a future<>
test: perf: Introduce perf-tablets
test: Introduce tablets_test
test: lib: Do not override table id in create_table()
utils, tablets: Introduce external_memory_usage()
db: tablets: Add printers
db: tablets: Add persistence layer
dht: Use last_token_of_compaction_group() in split_token_range_msb()
locator: Introduce tablet_metadata
dht: Introduce first_token()
dht: Introduce next_token()
storage_proxy: Improve trace-level logging
locator: token_metadata: Fix confusing comment on ring_range()
dht, storage_proxy: Abstract token space splitting
Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries"
db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms()
db: Introduce get_non_local_vnode_based_strategy_keyspaces()
service: storage_proxy: Avoid copying keyspace name in write handler
locator: Introduce per-table replication strategy
treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type
locator: Introduce effective_replication_map
locator: Rename effective_replication_map to vnode_effective_replication_map
locator: effective_replication_map: Abstract get_pending_endpoints()
db: Propagate feature_service to abstract_replication_strategy::validate_options()
db: config: Introduce experimental "TABLETS" feature
db: Log replication strategy for debugging purposes
db: Log full exception on error in do_parse_schema_tables()
db: keyspace: Remove non-const replication strategy getter
config: Reformat
Add a test for get/enable/disable auto_compaction via to column_family api.
And add log messages for admin operations over that api.
Closes#13566
* github.com:scylladb/scylladb:
api: column_family: add log messages for admin operation
test: rest_api: add test_column_family
It's meant to be used in places where currently
get_non_local_strategy_keyspaces() is used, but work only with
keyspaces which use vnode-based replication strategy.
This series cleans up the generation and value types used in gms / gossiper.
Currently we use a blend of int, int32_t, and int64_t around messaging.
This change defines gms::generation_type and gms::version_type as int32_t
and add check in non-release modes that the respective int64 value passed over messaging do not overflow 32 bits.
Closes#12966
* github.com:scylladb/scylladb:
gossiper: version_generator: add {debug_,}validate_gossip_generation
gms: gossip_digest: use generation_type and version_type
gms: heart_beat_state: use generation_type and version_type
gms: versioned_value: use version_type
gms: version_generator: define version_type and generation_type strong types
utils: move generation-number to gms
utils: add tagged_integer
gms: versioned_value: make members private
scylla-gdb: add get_gms_versioned_value
gms: versioned_value: delete unused compare_to function
gms: gossip_digest: delete unused compare_to function
The only reason why it's there (right next to compaction_fwd.hh) is
because the database::table_truncate_state subclass needs the definition
of compaction_manager::compaction_reenabler subclass.
However, the former sub is not used outside of database.cc and can be
defined in .cc. Keeping it outside of the header allows dropping the
compaction_manager.hh from database.hh thus greatly reducing its fanout
over the code (from ~180 indirect inclusions down to ~20).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13622
Adjust scylla-gdb.get_gms_version_value
to get the versioned_value version as version_type
(utils::tagged_integer).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
and provide accessor functions to get them.
1. So they can't be modified by mistake, as the versioned value is
immutable. A new value must have a higher version.
2. Before making the version a strong gms::version_type.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Move gms::arrival_window to api/failure_detector which is its only user.
and get rid of the rest, which is not used, now that we use direct_failure_detector instead.
TODO: integare direct_failure_detector with failure_detector api.
Closes#13576
* github.com:scylladb/scylladb:
gms: get rid of unused failure_detector
api: failure_detector: remove false dependency on failure_detector::arrival_window
test: rest_api: add test_failure_detector
this change replaces all occurrences of `boost::lexical_cast<std::string>`
in the source tree with `fmt::to_string()`. for couple reasons:
* `boost::lexical_cast<std::string>` is longer than `fmt::to_string()`,
so the latter is easier to parse and read.
* `boost::lexical_cast<std::string>` creates a stringstream under the
hood, so it can use the `operator<<` to stringify the given object.
but stringstream is known to be less performant than fmtlib.
* we are migrating to fmtlib based formatting, see #13245. so
using `fmt::to_string()` helps us to remove yet another dependency
on `operator<<`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13611
Up until 0ef33b71ba
get_endpoint_phi_values retrieved arrival samples
from gms::get_arrival_samples(). That function was
removed since it returned a constant ampty map.
This patch returns empty results without relying
on failure_detector::arrival_window, so the latter can
be retired altogether.
As Tomasz Grabiec <tgrabiec@scylladb.com> said:
> I don't think the logic of arrival_window belongs to api,
> it belongs to the failure detector. If there is no longers
> a failure detector, there should be no arrival_window.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Similarly to previous patch, but from another routes group. The storage
service API calls mainly use storage service, but one place needs proxy
to call recalculate_schema_version() with
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
So that the routes referencing and using ctx.sp don't step on a proxy
that's going to be removed (not now, but some time later) fron under
them on shutdown.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Similar to the storage_service api, print a log message
for admin operations like enabling/disabling auto_compaction,
running major compaction, and setting the table compaction
strategy.
Note that there is overlap in functionality
between the storage_service and the column_family api entry points.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Such namespace-wide imports can create conflicts between names that
are the same in seastar and std, such as {std,seastar}::future and
{std,seastar}::format, since we also have 'using namespace seastar'.
Replace the namespace imports with explicit qualification, or with
specific name imports.
Closes#13528
As a first step towards using host_id to identify nodes instead of ip addresses
this series introduces a node abstraction, kept in topology,
indexed by both host_id and endpoint.
The revised interface also allows callers to handle cases where nodes
are not found in the topology more gracefully by introducing `find_node()` functions
that look up nodes by host_id or inet_address and also get a `must_exist` parameter
that, if false (the default parameter value) would return nullptr if the node is not found.
If true, `find_node` throws an internal error, since this indicates a violation of an internal
assumption that the node must exist in the topology.
Callers that may handle missing nodes, should use the more permissive flavor
and handle the !find_node() case gracefully.
Closes#11987
* github.com:scylladb/scylladb:
topology: add node state
topology: remove dead code
locator: add class node
topology: rename update_endpoint to add_or_update_endpoint
topology: define get_{rack,datacenter} inline
shared_token_metadata: mutate_token_metadata: replicate to all shards
locator: endpoint_dc_rack: refactor default_location
locator: endpoint_dc_rack: define default operator==
test: storage_proxy_test: provide valid endpoint_dc_rack
Refactor the thread_local default_location out of
topology::get_location so it can be used elsewhere.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `bytes` and `gms::inet_address` without using ostream<<. also, this change removes all existing callers of `operator<<(ostream, const bytes &)` and `operator<<(ostream, const gms::inet_address&)`.
`gms::inet_address` related changes are included here in hope to demonstrate the usage of delimiter specifier of `fmt_hex` 's formatter.
Refs #13245Closes#13275
* github.com:scylladb/scylladb:
gms/inet_address: implement operator<< using fmt::formatter
treewide: use fmtlib to format gms::inet_address
gms/inet_address: specialize fmt::formatter<gms::inet_address>
bytes: implement formatting helpers using formatter
bytes: specialize fmt::formatter<bytes>
bytes: specialize fmt::formatter<fmt_hex>
bytes: mark fmt_hex::v `const`
the goal of this change is to reduce the dependency on
`operator<<(ostream&, const gms::inet_address&)`.
this is not an exhaustive search-and-replace change, as in some
caller sites we have other dependencies to yet-converted ostream
printer, we cannot fix them all, this change only updates some
caller of `operator<<(ostream&, const gms::inet_address&)`.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Task manager task implementations of classes that cover
cleanup keyspace compaction which can be started through
/storage_service/keyspace_compaction/ api.
Top level task covers the whole compaction and creates child
tasks on each shard.
Closes#12712
* github.com:scylladb/scylladb:
test: extend test_compaction_task.py to test cleanup compaction
compaction: create task manager's task for cleanup keyspace compaction on one shard
compaction: create task manager's task for cleanup keyspace compaction
api: add get_table_ids to get table ids from table infos
compaction: create cleanup_compaction_task_impl
this change is a leftover of 063b3be,
which failed to include the changes in the header files.
it turns out we have `using namespace httpd;` in seastar's
`request_parser.rl`, and we should not rely on this statement to
expose the symbols in `seatar::httpd` to `seastar` namespace.
in this change,
* api/*.hh: all httpd symbols are referenced by `httpd::*`
instead of being referenced as if they are in `seastar`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this change tries to reduce the number of callers using operator<<()
for printing UUID. they are found by compiling the tree after commenting
out `operator<<(std::ostream& out, const UUID& uuid)`. but this change
alone is not enough to drop all callers, as some callers are using
`operator<<(ostream&, const unordered_map&)` and other overloads to
print ranges whose elements contain UUID. so in order to limit the
scope of the change, we are not changing them here.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Add an API call to wait for all shards to reach the current shard 0
gossiper version. Throws when timeout is reached.
Closes#12540
* github.com:scylladb/scylladb:
api: gossiper: fix alive nodes
gms, service: lock live endpoint copy
gms, service: live endpoint copy method
Fix API call to wait for all shards to reach the current shard 0
gossiper version. Throws when timeout is reached.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>