Update docs for backup procedure to use `DESC SCHEMA WITH INTERNALS`
instead of plain `DESC SCHEMA`.
Add a note to use cqlsh in a proper version (at least 6.0.19).
Closesscylladb/scylladb#18953
This effectively removes "finally" block so if
authorized_prepared_cache.stop() resolves with exception, the
prepared_cache.stop() is skipped. But that's not a problem -- even if
.stop() throws the shole scylla stop aborts so we don't really care if
it was clean or not.
Also, authorized_prepared_cache.stop() closes the gate and cancels the
timer. None of those can resolve with exception.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#19001
Currently, there is no indication of tablets in the logged KSMetaData.
Print the tablets configuration of either the`initial` number of tablets,
if enabled, or {'enabled':false} otherwise.
For example:
```
migration_manager - Create new Keyspace: KSMetaData{name=tablets_ks, strategyClass=org.apache.cassandra.locator.NetworkTopologyStrategy, strategyOptions={"datacenter1": "1"}, cfMetaData={}, durable_writes=true, tablets={"initial":0}, userTypes=org.apache.cassandra.config.UTMetaData@0x600004d446a8}
migration_manager - Create new Keyspace: KSMetaData{name=vnodes_ks, strategyClass=org.apache.cassandra.locator.NetworkTopologyStrategy, strategyOptions={"datacenter1": "1"}, cfMetaData={}, durable_writes=true, tablets={"enabled":false}, userTypes=org.apache.cassandra.config.UTMetaData@0x600004c33ea8}
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#18998
This patch adds a test reproducing for the known issue #7963, where
after adding a secondary-index to a table, queries might immediately
start to use this index - even before it is built - and produce wrong
results.
The issue is still open and unfixed, so the new test is marked "xfail".
Interestingly, even though Cassandra claims to have found and fixed
a similar bug in 2015 (CASSANDRA-8505), this test also fails on
Cassandra - trying a query right after CREATE INDEX and before it
was fully built may cause the query to fail.
Refs #7963
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#18993
tablet snapshot, used by migration, can race with compaction and
can find files deleted. That won't cause data loss because the
error is propagated back into the coordinator that decides to
retry streaming stage. So the consequence is delayed migration,
which might in turn reduce node operation throughput (e.g.
when decommissioning a node). It should be rare though, so
shouldn't have drastic consequences.
Fixes#18977.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#18979
Incremented the components_memory_reclaim_threshold config's default
value to 0.2 as the previous value was too strict and caused unnecessary
eviction in otherwise healthy clusters.
Fixes#18607
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closesscylladb/scylladb#18964
Currently the loader is called via API, which inherits the maintenance scheduling group from API http server. The loader then can either do load_and_stream() or call (legacy) distributed_loader::upload_new_sstables(). The latter first switches into streaming scheduling group, but the former doesn't and continues running in the maintenance one.
All this is not really a problem, because streaming sched group and maintenance sched group is one group under two different variable names. However, it's messy and worth delegating the sched group switch (even if it's a no-op) to the sstables-loader. As a nice side effect, this patch removes one place that uses database as proxy object to get configuration parameters.
Closesscylladb/scylladb#18928
* github.com:scylladb/scylladb:
sstables-loader: Run loading in its scheduling group
sstables-loader: Add scheduling group to constructor
... and replace it with boolean enable_tablets option. All the places
in the code are patched to check the latter option instead of the former
feature.
The option is OFF by default, but the default scylla.yaml file sets this
to true, so that newly installed clusters turn tablets ON.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#18898
before this change, we rely on `seastar/util/std-compat.hh` to
include the used headers provided by stdandard library. this was
necessary before we moved to a C++20 compliant standard library
implementation. but since Seastar has dropped C++17 support. its
`seastar/util/std-compat.hh` is not responsible for providing these
headers anymore.
so, in this change, we include the used header directly instead
of relying on `seastar/util/std-compat.hh`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18986
before this change, in order to avoid repeating/hardwiring the
compiling options set by Seastar, we just inherit the compiling
options of Seastar for building Abseil, as the former exposes the
options to enable sanitizers.
this works fine, despite that, strictly speaking, not all options
are necessary for building abseil, as abseil is not a Seastar
application -- it is just a C++ library.
but when we introduce dependencies which are only generated at
build time, and these dependencies are passed to the compiler
at build time, this breaks the build of Abseil. because these
dependencies are exposed by the Seastar's .pc file, and consumed
by Abseil. when building Abseil, apparently, the building process
driven by ninja is not started yet, so we are not able to build
Abseil with these settings due to missing dependencies.
so instead of inheriting the compiling options from Seastar, just
set the sanitizer related compiling options directly, to avoid
referencing these missing dependencies.
the upside is that we pass a much smaller set of compiling options
to compiler when building Abseil, the downside is that we hardwire
these options related to sanitizer manually, they are also detected
by Seastar's building system. but fortunately, these options are
relatively stable across the building environements we support.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18987
In the test_mv_topology_change case, we use an injection to
delay the view updates application, so that the ERMs have
a chance to change in the process. This injection was also
enabled on a new node in the test, which was later decommissioned.
During the shutdown, writes were still being performed, causing
view update generation and delays due to the injection which in
turn delayed the node shutdown, causing the test to timeout.
This patch removes the injection for the node being shut down.
At the same time, the force_gossip_topology_changes=True option
is also removed from its config, but for that option it's enough
to enable on the first node in the cluster and all nodes use it.
Fixes: https://github.com/scylladb/scylladb/issues/18941Closesscylladb/scylladb#18958
The Alternator test suite usually runs on a specific configuration of
Scylla set up by test.py or test/alternator/run. However, we do consider
it an important design goal of this test suite that developers should be
able to run these tests against any DynamoDB-API implementation, including
any version Scylla manually run by the developer in *any way* he or she
pleases.
The recent commit dc80b5dafe changed the way
we retrieve the configured autentication key, which is needed if Scylla is
run with --alternator-enforce-authorization. However, the new code assumed
that Scylla was also run with
--authenticator PasswordAuthenticator --authorizer CassandraAuthorizer
so that the default role of "cassandra" has a valid, non-null, password
(namely, "cassandra"). If the developer ran Scylla manually without
these options, the test initialization code broke, and all tests in the
suite failed.
This patch fixes this breakage. You can now run the Alternator test
suite against Scylla run manually without any of the aforementioned
options, and everything will work except some tests in test_authorization.py
will fail as expected.
This patch has no affect on the usual test.py or test/alternator/run
runs, as they already run Scylla with all the aforementioned options
and weren't exposed to the problem fixed here.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#18957
storage_proxy is responsible for intersecting the range of the read
with tablets, and calling replica with a single tablet range, therefore
it makes sense to avoid touching memtables of tablets that don't
intersect with a particular range.
Note this is a performance issue, not correctness one, as memtable
readers that don't intersect with current range won't produce any
data, but cpu is wasted until that's realized (they're added to list
of readers in mutation_reader_merger, more allocations, more data
sources to peek into, etc).
That's also important for streaming e.g. after decommission, that
will consume one tablet at a time through a reader, so we don't want
memtables of streamed tablets (that weren't cleaned up yet) to
be consumed.
Refs #18904.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#18907
This change supports changing replication factor in tablets-enabled keyspaces.
This covers both increasing and decreasing the number of tablets replicas through
first building topology mutations (`alter_keyspace_statement.cc`) and then
tablets/topology/schema mutations (`topology_coordinator.cc`).
For the limitations of the current solution, please see the docs changes attached to this PR.
Fixes: #16129Closesscylladb/scylladb#16723
* github.com:scylladb/scylladb:
test: Do not check tablets mutations on nodes that don't have them
test: Fix the way tablets RF-change test parses mutation_fragments
test/tablets: Unmark RF-changing test with xfail
docs: document ALTER KEYSPACE with tablets
Return response only when tablets are reallocated
cql-pytest: Verify RF is changes by at most 1 when tablets on
cql3/alter_keyspace_statement: Do not allow for change of RF by more than 1
Reject ALTER with 'replication_factor' tag
Implement ALTER tablets KEYSPACE statement support
Parameterize migration_manager::announce by type to allow executing different raft commands
Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks
Extend system.topology with 3 new columns to store data required to process alter ks global topo req
Allow query_processor to check if global topo queue is empty
Introduce new global topo `keyspace_rf_change` req
New raft cmd for both schema & topo changes
Add storage service to query processor
tablets: tests for adding/removing replicas
tablet_allocator: make load_balancer_stats_manager configurable by name
`tablets_options->erase(it);` invalidates `it`, but it's still referred
to later in the code in the last `else`, and when that code is invoked,
we get a `heap-use-after-free` crash.
Fixes: #18926Closesscylladb/scylladb#18936
This patch cleans up a bit the code in Alternator which splits up
the operation's X-Amz-Target header (the second part of it is the
name of the operation, e.g., CreateTable).
The patch doesn't change any functionality or change performance in
any meaningful way. I was just reviewing this code and was annoyed by
the unnecessary variable and unnecessary creation of strings and
vectors for such a simple operation - and wanted to clean it up.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#18830
The test creates two partitions and passes them through the reader, but
the partitions are out-of-order. This is benign but best to fix it
anyway.
Found after bumping validation level inside the compactor.
Closesscylladb/scylladb#18848
Tests in test_tombstone_gc.py are parametrized with string instead
of bool values. Fix that. Use the value to create a keyspace with
or without tablets.
Fixes: #18888.
Closesscylladb/scylladb#18893
otherwise when compiling with the new seastar, which removed
`#include <variant>` from `std-compat.hh`, the {mode}-headers
target would fail to build, like:
```
./data_dictionary/storage_options.hh:34:29: error: no template named 'variant' in namespace 'std'
10:45:15 using value_type = std::variant<local, s3>;
10:45:15 ~~~~~^
10:45:15 ./data_dictionary/storage_options.hh:35:5: error: unknown type name 'value_type'; did you mean 'std::_Bit_const_iterator::value_type'?
10:45:15 value_type value = local{};
10:45:15 ^~~~~~~~~~
10:45:15 std::_Bit_const_iterator::value_type
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18921
This commit adds the information that the manual recovery procedure
is not supported if tablets are enabled.
In addition, the content in the Manual Recovery Procedure is reorganized
by adding the Prerequisites and Procedure subsections - in this way,
we can limit the number of Note and Warning boxes that made the page
hard to follow.
Fixes https://github.com/scylladb/scylladb/issues/18895Closesscylladb/scylladb#18935
When calculating the base-view mapping while the topology
is changing, we may encounter a situation where the base
table noticed the change in its effective replication map
while the view table hasn't, or vice-versa. This can happen
because the ERM update may be performed during the preemption
between taking the base ERM and view ERM, or, due to f2ff701,
the update may have just been performed partially when we are
taking the ERMs.
Until now, we assumed that the ERMs are synchronized while calling
finding the base-view endpoint mapping, so in particular, we were
using the topology from the base's ERM to check the datacenters of
all endpoints. Now that the ERMs are more likely to not be the same,
we may try to get the datacenter of a view endpoint that doesn't
exist in the base's topology, causing us to crash.
This is fixed in this patch by using the view table's topology for
endpoints coming from the view ERM. The mapping resulting from the
call might now be a temporary mapping between endpoints in different
topologies, but it still maps base and view replicas 1-to-1.
Fixes: #17786Fixes: #18709Closesscylladb/scylladb#18816
This series ignores errors in `load_history()` to prevent `abort_requested_exception` coming from `get_repair_module().check_in_shutdown()` from escaping during `repair_service::stop()`, causing
```
repair_service::~repair_service(): Assertion `_stopped' failed.
```
Fixes https://github.com/scylladb/scylladb/issues/18889
Backport to 6.0 required due to 523895145dClosesscylladb/scylladb#18890
* github.com:scylladb/scylladb:
repair: load_history: warn and ignore all errors
repair_service: debug stop
The check is performed by selecting from mutation_fragments(table), but
it's known that this query crashes Scylla when there's no tablet replica
on that node.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When the test changes RF from 2 to 3, the extra node executes "rebuild"
transition which means that it streams tablets replicas from two other
peers. When doing it, the node receives two sets of sstables with
mutations from the given tablet. The test part that checks if the extra
node received the mutations notices two mutation fragments on the new
replica and errorneously fails by seeing, that RF=3 is not equal to the
number of mutations found, which is 4.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Up until now we waited until mutations are in place and then returned
directly to the caller of the ALTER statement, but that doesn't imply
that tablets were deleted/created, so we must wait until the whole
processing is done and return only then.
We want to ensure that when the replication factor
of a keyspace changes, it changes by at most 1 per DC
if it uses tablets. The rationale for that is to make
sure that the old and new quorums overlap by at least
one node.
After these changes, attempts to change the RF of
a keyspace in any DC by more than 1 will fail.
This patch removes the support for the "wildcard" replication_factor
option for ALTER KEYSPACE when the keyspace supports tablets.
It will still be supported for CREATE KEYSPACE so that a user doesn't
have to know all datacenter names when creating the keyspace,
but ALTER KEYSPACE will require that and the user will have to
specify the exact change in replication factors they wish to make by
explicitly specifying the datacenter names.
Expanding the replication_factor option in the ALTER case is
unintuitive and it's a trap many users fell into.
See #8881, #15391, #16115
This commit adds support for executing ALTER KS for keyspaces with
tablets and utilizes all the previous commits.
The ALTER KS is handled in alter_keyspace_statement, where a global
topology request in generated with data attached to system.topology
table. Then, once topology state machine is ready, it starts to handle
this global topology event, which results in producing mutations
required to change the schema of the keyspace, delete the
system.topology's global req, produce tablets mutations and additional
mutations for a table tracking the lifetime of the whole req. Tracking
the lifetime is necessary to not return the control to the user too
early, so the query processor only returns the response while the
mutations are sent.
Since ALTER KS requires creating topology_change raft command, some
functions need to be extended to handle it. RAFT commands are recognized
by types, so some functions are just going to be parameterized by type,
i.e. made into templates.
These templates are instantiated already, so that only 1 instances of
each template exists across the whole code base, to avoid compiling it
in each translation unit.
Because ALTER KS will result in creating a global topo req, we'll have
to pass the req data to topology coordinator's state machine, and the
easiest way to do it is through sytem.topology table, which is going to
be extended with 3 extra columns carrying all the data required to
execute ALTER KS from within topology coordinator.
With current implementation only 1 global topo req can be executed at a
time, so when ALTER KS is executed, we'll have to check if any other
global topo req is ongoing and fail the req if that's the case.
This PR ensures that CDC keeps working correctly in the recovery
mode after leaving the raft-based topology.
We update `system.cdc_local` in `topology_state_load` to ensure
a node restarting in the recovery mode sees the last CDC generation
created by the topology coordinator.
Additionally, we extend the topology recovery test to verify
that the CDC keeps working correctly during the whole recovery
process. In particular, we test that after restarting nodes in the
recovery mode, they correctly use the active CDC generation created
by the topology coordinator.
Fixesscylladb/scylladb#17409Fixesscylladb/scylladb#17819Closesscylladb/scylladb#18820
* github.com:scylladb/scylladb:
test: test_topology_recovery_basic: test CDC during recovery
test: util: start_writes_to_cdc_table: add FIXME to increase CL
test: util: start_writes_to_cdc_table: allow restarting with new cql
storage_service: update system.cdc_local in topology_state_load
This patch fixes two bugs in `test_topology_ops`:
1. The values of `tablets_enabled` were nonempty strings, so they
always evaluated to `True` in the if statement responsible for
enabling writing workers only if tablets are disabled. Hence, the
writing workers were always disabled.
2. The `topology_experimental_raft suite` uses tablets by default,
so we need a config with empty `experimental_features` to disable
them.
Ensuring this test works with and without tablets is considered
a part of 6.0, so we should backport this patch.
Closesscylladb/scylladb#18900
Now the loading code has two different paths, and only one of them
switches sched group. It's cleaner and more natural to switch the sched
group in the loader itself, so that all code paths run in it and don't
care switching.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>