Migration manager installs several feature change listeners:
if (this_shard_id() == 0) {
_feature_listeners.push_back(_feat.cluster_supports_view_virtual_columns().when_enabled(update_schema));
_feature_listeners.push_back(_feat.cluster_supports_digest_insensitive_to_expiry().when_enabled(update_schema));
_feature_listeners.push_back(_feat.cluster_supports_cdc().when_enabled(update_schema));
_feature_listeners.push_back(_feat.cluster_supports_per_table_partitioners().when_enabled(update_schema));
}
They will call update_schema_version_and_announce() when features are enabled, which does this:
return update_schema_version(proxy, features).then([] (utils::UUID uuid) {
return announce_schema_version(uuid);
});
So it first updates the schema version and then publishes it via
gossip in announce_schema_version(). It is possible that the
announce_schema_version() part of the first schema change will be
deferred and will execute after the other four calls to
update_schema_version_and_announce(). In that case, it installs the old schema
version in gossip instead of the more recent one.
The fix is to serialize schema digest calculation and publishing.
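A minimal sketch of that fix, using std::mutex as a stand-in for the actual seastar serialization primitive (all names here are illustrative, not Scylla's real ones): a single lock covers both the digest recalculation and the gossip publish, so a slow first announce can no longer be overtaken by the announces of later feature enablements.

```cpp
#include <mutex>
#include <string>

// Hypothetical sketch: one lock around both steps keeps the announce
// order identical to the update order.
std::mutex schema_announce_mutex;
std::string gossiped_version;

void update_schema_version_and_announce(const std::string& recalculated) {
    std::lock_guard<std::mutex> guard(schema_announce_mutex);
    // update_schema_version(): recompute the digest (elided)
    // announce_schema_version(): publish the result via gossip
    gossiped_version = recalculated;
}
```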
Refs #7200
There are 4 places that call this helper:
- storage proxy. Callers are rpc verb handlers and already have the proxy
at hand, from which they can get the messaging service instance
- repair. There's a global messaging instance at hand, and the caller
is in a verb handler too
- streaming. The caller is a verb handler, which is unregistered on stop, so
the messaging service instance can be captured
- migration manager itself. The caller already uses "this", so the messaging
service instance can be obtained from it
The better approach would be to make get_schema_definition a method of
migration_manager, but the manager is stopped for real on shutdown, so
referencing it from the callers might not be safe and needs revisiting. At
the same time, the messaging service is always alive, so using its reference
is safe.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Most of those places are non-static migration_manager methods, plus one
place where the local service instance is already at hand.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The local migration manager instance is already available at caller, so
we can call a method on it. This is to facilitate next patching.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The user of this verb is migration manager, so the handler must be it as well.
The handler code now explicitly gets the global proxy. This call is safe, as the proxy
is not stopped nowadays. In the future we'll need to revisit the relation
between migration - proxy - stats anyway.
The use of the local migration manager is safe, as it happens in a verb handler
which is unregistered and waited for on migration manager stop.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
the code pattern looked like:
<collection>.find(<element>) != <collection>.end()
In C++20 the same can be expressed with:
<collection>.contains(<element>)
This is not only more concise but also expresses the intent of the code
more clearly.
This commit replaces all the occurrences of the old pattern with the new
approach.
Tests: unit(dev)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>
The schema_tables.hh -> migration_manager.hh couple seems to act as a
"single header for everything", creating big bloat for many seemingly
unrelated .hh's.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The seastar api v4 changes the return type of when_all_succeed. This
patch adds discard_result where that is the best solution to handle the
change.
This doesn't do the actual update to v4 since there are still a few
issues left to fix in seastar. A patch doing just the update will
follow.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200617233150.918110-1-espindola@scylladb.com>
sync_schema is supposed to make sure that this node knows about all
schema changes known by "nodes" that were made prior to this call.
Currently, when a node is down, the sync is silently skipped.
To fix, add a flag to migration_task::run_may_throw to indicate that it
should fail if a node is down.
Fixes #4791
The call for merge_schema_from is in some cases run in the
background and thus is not aborted/waited on shutdown. This
may result in a use-after-free, one example being:
merge_schema_from
-> read_schema_for_keyspace
-> db::system_keyspace::query
-> storage_proxy::query
-> query_partition_key_range_concurrent
In the latter function the proxy._token_metadata is accessed,
while the respective object may already be freed (unlike the
storage_proxy itself, which is still leaked on shutdown).
Related bugs: #5903, #5999 (cannot reproduce, though)
Tests: unit(dev), manual start-stop
dtest(consistency.TestConsistency, dev)
dtest(schema_management, dev)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Reviewed-by: Pekka Enberg <penberg@scylladb.com>
Message-Id: <20200316150348.31118-1-xemul@scylladb.com>
Following commits make it possible to set a specific
partitioner for a table. We want to persist that information
and include it into schema digest. For that a new column
in scylla_tables is needed. This commit adds such column.
We add the new column to scylla_tables because it's a Scylla
specific extension.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
The maybe_schedule_schema_pull waits for schema_tables_v3 to
become available. This is unsafe in case the migration manager
goes away before the feature is enabled.
Fix this by subscribing to the feature with feature::listener and
waiting on a condition variable in maybe_schedule_schema_pull.
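A hedged sketch of the pattern, with std primitives standing in for seastar's feature listener and condition variable (all types and names below are illustrative): instead of waiting on the feature directly, which may outlive the manager, the manager registers a listener that signals a condition variable the manager itself owns.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <vector>

// Toy model of a cluster feature with enable-time listeners.
struct feature_sketch {
    bool enabled = false;
    std::vector<std::function<void()>> listeners;
    void enable() {
        enabled = true;
        for (auto& l : listeners) {
            l();
        }
    }
};

struct migration_manager_sketch {
    std::mutex m;
    std::condition_variable cv;
    bool schema_tables_v3_on = false;

    // The listener only touches state owned by the manager.
    void subscribe(feature_sketch& f) {
        f.listeners.push_back([this] {
            std::lock_guard<std::mutex> g(m);
            schema_tables_v3_on = true;
            cv.notify_all();
        });
    }

    void maybe_schedule_schema_pull() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return schema_tables_v3_on; });
        // ... safe to schedule the pull now ...
    }
};
```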
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The sleep is interrupted with the abort source; the "wait" part
is done with the existing _background_tasks gate. We also need
to make sure the gate stays alive till the end of the function,
so we make use of async_sharded_service (migration manager is
already one).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The goal is to have a token_metadata reference inside the
keyspace_metadata.validate method. This can be achieved
by doing the validation through the database reference
which is at hand in migration_manager.
While at it, merge the validation with exists/not-exists
checks done in the same places.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This unties migration_manager from storage_service thus breaking
the circular dependency between these two.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This templates the code for listener_vector, renames it to
atomic_vector and moves it to the utils directory.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Before this patch the iterations over migration_notifier::_listeners
could race with listeners being added and removed.
The addition side is not modified, since it is common to add a
listener during construction and it would require a fairly big
refactoring. Instead, the iteration is modified to use indexes instead
of iterators so that it is still valid if another listener is added
concurrently.
For removal we use a rw lock, since removing an element invalidates
indexes too. There are only a few places that needed refactoring to
handle unregister_listener returning a future<>, so this is probably
OK.
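The addition side can be illustrated with a small sketch (names are made up for the example, not the actual atomic_vector code): iterating by index keeps the traversal valid even if a callback appends a new listener, since a vector iterator could be invalidated by reallocation while an index cannot. Copying the function object before invoking it also protects the call itself from a reallocation triggered inside the callback.

```cpp
#include <functional>
#include <vector>

// Index-based notification: tolerates push_back during the loop, and a
// listener registered mid-iteration is picked up by the same pass.
int notify_all(std::vector<std::function<void()>>& listeners) {
    int notified = 0;
    for (size_t i = 0; i < listeners.size(); ++i) {
        auto listener = listeners[i]; // copy: reallocation-safe invocation
        listener();
        ++notified;
    }
    return notified;
}
```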
Fixes #5541.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200120192819.136305-1-espindola@scylladb.com>
The factory is a purely stateless thing; it makes no difference which
instance of it is used, so we may omit referencing the storage_service
in passive_announce.
This is 2nd simple migration_manager -> storage_service link to cut
(more to come later).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are several places where migration_manager needs storage_service
reference to get the database from, thus forming the mutual dependency
between them. This is the simplest case where the migration_manager
link to the storage_service can be cut -- the database reference can be
obtained from storage_proxy instead.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The _listeners list on the migration_manager class and the corresponding
notify_xxx helpers have nothing to do with its instances; they
are just a transport for notification delivery.
At the same time some services need the migration manager to be alive
at their stop time to unregister from it, while the manager itself
may need them for its needs.
The proposal is to move the migration notifier into a completely separate
sharded "service". This service doesn't need anything, so it's started
first and stopped last.
While it's not effectively a "migration" notifier, we inherited the name
from Cassandra and renaming it will "scramble neurons in the old-timers'
brains but will make it easier for newcomers" as Avi says.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Addition of cdc column in scylla_tables changes how schema
digests are calculated, and affects the ABI of schema update
messages (adding a column changes other columns' indexes
in frozen_mutation).
To fix this, extend the schema_tables mechanism with support
for the cdc column, and adjust schemas and mutations to remove
that column when sending schemas during upgrade.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Schema is node-global, update_schema_version_and_announce() updates
all shards. We don't need to recalculate it from every shard, so
install the listeners only on shard 0. Reduces noise in the logs.
Message-Id: <1574872860-27899-1-git-send-email-tgrabiec@scylladb.com>
Merged pull request https://github.com/scylladb/scylla/pull/5366 from Calle Wilund:
Moves schema creation/alter/drop awareness to use new "before" callbacks from
migration manager, and adds/modifies log and streams table as part of the base
table modification.
Makes schema changes semi-atomic per node. While this does not deal with updates
coming in before a schema change has propagated across the cluster, it now falls
into the same pit as when this happens without CDC.
An added side effect is also that now schemas are transparent across all subsystems,
not just cql.
Patches:
cdc_test: Add small test for altering base schema (add column)
cdc: Handle schema changes via migration manager callbacks
migration_manager: Invoke "before" callbacks for table operations
migration_listener: Add empty base class and "before" callbacks for tables
cql_test_env: Include cdc service in cql tests
cdc: Add sharded service that does nothing.
cdc: Move "options" to separate header to avoid too much header inclusion
cdc: Remove some code from header
Before stopping the db itself, stop the migration service.
It must be stopped before RPC, but RPC is not stopped yet
itself, so we should be safe here.
Here's the tail of the resulting logs:
INFO 2019-11-20 11:22:35,193 [shard 0] init - shutdown migration manager
INFO 2019-11-20 11:22:35,193 [shard 0] migration_manager - stopping migration service
INFO 2019-11-20 11:22:35,193 [shard 1] migration_manager - stopping migration service
INFO 2019-11-20 11:22:35,193 [shard 0] init - Shutdown database started
INFO 2019-11-20 11:22:35,193 [shard 0] init - Shutdown database finished
INFO 2019-11-20 11:22:35,193 [shard 0] init - stopping prometheus API server
INFO 2019-11-20 11:22:35,193 [shard 0] init - Scylla version 666.development-0.20191120.25820980f shutdown complete.
Also -- stop the mm on drain before the commitlog is stopped.
[Tomasz: mm needs the cl because pulling schema changes from other nodes
involves applying them into the database. So cl/db needs to be
stopped after mm is stopped.]
The drain logs would look like
...
INFO 2019-11-25 11:00:40,562 [shard 0] migration_manager - stopping migration service
INFO 2019-11-25 11:00:40,562 [shard 1] migration_manager - stopping migration service
INFO 2019-11-25 11:00:40,563 [shard 0] storage_service - DRAINED:
and then on stop
...
INFO 2019-11-25 11:00:46,427 [shard 0] init - shutdown migration manager
INFO 2019-11-25 11:00:46,427 [shard 0] init - Shutdown database started
INFO 2019-11-25 11:00:46,427 [shard 0] init - Shutdown database finished
INFO 2019-11-25 11:00:46,427 [shard 0] init - stopping prometheus API server
INFO 2019-11-25 11:00:46,427 [shard 0] init - Scylla version 666.development-0.20191125.3eab6cd54 shutdown complete.
Fixes #5300
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20191125080605.7661-1-xemul@scylladb.com>
With this it is possible to create user defined functions and
aggregates and they are saved to disk and the schema change is
propagated.
It is just not possible to call them yet.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
The verbs are:
* DEFINITIONS_UPDATE (push)
* MIGRATION_REQUEST (pull)
Support was added in a backward-compatible way. The push verb sends
both the old frozen mutation parameter and the new optional canonical
mutation parameter. It is expected that new nodes will use the latter,
while old nodes will fall-back to the former. The pull verb has a new
optional `options` parameter, which for now contains a single flag:
`remote_supports_canonical_mutation_retval`. This flag, if set, means
that the remote node supports the new canonical mutation return value,
thus the old frozen mutations return value can be left empty.
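A hedged illustration of the compatibility scheme (the struct and field names below are made up for the example, not the actual IDL): the push message carries both the legacy frozen payload and an optional canonical payload; an upgraded receiver prefers the canonical one, while an old receiver only ever reads the legacy one.

```cpp
#include <optional>
#include <string>

// Stand-in for the DEFINITIONS_UPDATE message payload.
struct definitions_update_msg {
    std::string frozen_mutations;                   // legacy, always sent
    std::optional<std::string> canonical_mutations; // new, optional
};

// An upgraded node uses the canonical form when present; otherwise
// (or on an old node) it falls back to the frozen form.
const std::string& pick_payload(const definitions_update_msg& msg,
                                bool receiver_is_upgraded) {
    if (receiver_is_upgraded && msg.canonical_mutations) {
        return *msg.canonical_mutations;
    }
    return msg.frozen_mutations;
}
```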
In preparation to the schema push/pull migrating to use canonical
mutations, convert the method producing the schema mutations to return a
vector of canonical mutations. The only user, MIGRATION_REQUEST verb,
converts the canonical mutations back to frozen mutations. This is very
inefficient, but this path will only be used in mixed clusters. After
all nodes are upgraded the verb will be sending the canonical mutations
directly instead.
The warning for discarded futures will only become useful once we can
silence all present warnings and flip the flag to make it an error.
Then it will start being useful in finding new, accidental discarding of
futures.
This series silences all remaining warnings in the Scylla codebase. For
those cases where it was obvious that the future is discarded on
purpose, with the author having taken all necessary precautions (handling
exceptions), the warning was simply silenced by casting the future to void
and adding a relevant comment. Where the discarding seems to have been done
in error, I have fixed the code to not discard it. To the rest of the
sites I added a FIXME to fix the discarding.
* 'resolve-discarded-future-warnings/v4.2' of https://github.com/denesb/scylla:
treewide: silence discarded future warnings for questionable discards
treewide: silence discarded future warnings for legit discards
tests: silence discarded future warnings
tests/cql_query_test.cc: convert some tests to thread
This patch silences the remaining discarded future warnings, those
where it cannot be determined with reasonable confidence that this was
indeed the actual intent of the author, or that the discarding of the
future could lead to problems. For all those places a FIXME is added,
with the intent that these will soon be followed up with an actual fix.
I deliberately haven't fixed any of these, even if the fix seems
trivial. It is too easy to overlook a bad fix mixed in with so many
mechanical changes.
This patch silences those future discard warnings where it is clear that
discarding the future was actually the intent of the original author,
*and* they did the necessary precautions (handling errors). The patch
also adds some trivial error handling (logging the error) in some
places, which were lacking this, but otherwise look ok. No functional
changes.
Introduced in c96ee98.
We call update_schema_version() after features are enabled and we
recalculate the schema version. This method is not updating gossip
though. The node will still use its database::version() to decide on
syncing, so it will not sync and stay inconsistent in gossip until the
next schema change.
We should call update_schema_version_and_announce() instead so that
the gossip state is also updated.
There is no actual schema inconsistency, but the joining node will
think there is and will wait indefinitely. Making a random schema
change would unblock it.
Fixes #4647.
Message-Id: <1566825684-18000-1-git-send-email-tgrabiec@scylladb.com>
The function announce_column_family_drop() drops (deletes) a base table
and all the materialized-views used for its secondary indexes, but not
other materialized views - if there are any, the operation refuses to
continue. This is exactly what CQL's "DROP TABLE" needs, because it is
not allowed to drop a table before manually dropping its views.
But there is no inherent reason why we can't support an operation
to delete a table and *all* its views - not just those related to indexes.
This patch adds such an option to announce_column_family_drop().
This option is not used by the existing CQL layer, but can be used
by other code automating operations programmatically without CQL.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190716150559.11806-1-nyh@scylladb.com>
Schema digest is calculated by querying for mutations of all schema
tables, then compacting them so that all tombstones in them are
dropped. However, even if the mutation becomes empty after compaction,
we still feed its partition key to the digest. If the same mutations were compacted
prior to the query, because the tombstones expire, we won't get any
mutation at all and won't feed the partition key. So schema digest
will change once an empty partition of some schema table is compacted
away.
That's not a problem during normal cluster operation because the
tombstones will expire at all nodes at the same time, and the schema
digest, although it changes, will change to the same value on all nodes
at about the same time.
This fix changes digest calculation to not feed any digest for
partitions which are empty after compaction.
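A toy model of that change (the hashing below is illustrative, not Scylla's actual digest): a partition whose content is empty after compaction contributes nothing, so it hashes exactly like a partition that was already compacted away before the query.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in for a schema-table partition after compaction.
struct partition_sketch {
    std::string key;
    std::vector<std::string> live_rows; // what survives compaction
};

// Skip tombstone-only partitions entirely: do not even feed their keys.
size_t digest(const std::vector<partition_sketch>& partitions) {
    size_t h = 0;
    for (const auto& p : partitions) {
        if (p.live_rows.empty()) {
            continue; // empty after compaction: contributes nothing
        }
        h ^= std::hash<std::string>{}(p.key) + 0x9e3779b9 + (h << 6) + (h >> 2);
        for (const auto& row : p.live_rows) {
            h ^= std::hash<std::string>{}(row);
        }
    }
    return h;
}
```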
The digest returned by schema_mutations::digest() is left unchanged by
this patch. It affects the table schema version calculation. It's not
changed because the version is calculated on boot, where we don't yet
know all the cluster features. It's possible to fix this but it's more
complicated, so this patch defers that.
Refs #4485.
After 7c87405, schema sync includes system_schema.view_virtual_columns
in the schema digest. Old nodes don't know about this table and will
not include it in the digest calculation. As a result, there will be
schema disagreement until the whole cluster is upgraded.
Fix this by taking the new table into account only when the whole
cluster is upgraded.
The table should not be used for anything before this happens. This is
not currently enforced, but should be.
Fixes #4457.
There were many calls to read_keyspace_mutation. One in each function
that prepares a mutation for some other schema change.
With this patch they are all moved to a single location.
Tests: unit (dev, debug)
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190328024440.26201-1-espindola@scylladb.com>