Commit Graph

17 Commits

Author SHA1 Message Date
Avi Kivity
3bead8cea0 feature: grandfather PER_TABLE_PARTITIONERS
The PER_TABLE_PARTITIONERS feature was added in 90df9a44ce (2020; 4.0)
and can now be assumed to be always present. We also remove the associated
schema_feature.
2024-05-18 00:15:07 +03:00
Avi Kivity
c7d7ca2c23 feature: grandfather CDC
The CDC feature was made non-experimental in e9072542c1 (2020; 4.4)
and can now be assumed to be always present. We also remove the corresponding
schema_feature.
2024-05-17 20:41:20 +03:00
Avi Kivity
b5f6021a6b feature: grandfather VIEW_VIRTUAL_COLUMNS
The VIEW_VIRTUAL_COLUMNS feature was added in a108df09f9 (2019; 3.1)
and can now be assumed to be always present.

The corresponding schema_feature is removed. Note schema_features are not sent
over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.
2024-05-17 20:41:19 +03:00
Kamil Braun
07984215a3 feature_service: add GROUP0_SCHEMA_VERSIONING feature
This feature, when enabled, will modify how schema versions
are calculated and stored.

- In group 0 mode, schema versions are persisted by the group 0 command
  that performs the schema change, then reused by each node instead of
  being calculated as a digest (hash) by each node independently.
- In RECOVERY mode or before Raft upgrade procedure finishes, when we
  perform a schema change, we revert to the old digest-based way, taking
  into account the possibility of having performed group0-mode schema
  changes (that used persistent versions). As we will see in future
  commits, this will be done by storing additional flags and tombstones
  in system tables.

By "schema versions" we mean both the UUIDs returned from
`schema::version()` and the "global" schema version (the one we gossip
as `application_state::SCHEMA`).

For now, in this commit, the feature is always disabled. Once all
necessary code is setup in following commits, we will enable it together
with Raft.
2023-12-05 13:03:28 +01:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Avi Kivity
35849fc901 Revert "Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun"
This reverts commit 3d4398d1b2, reversing
changes made to 45dfce6632. The commit
causes some schema changes to be lost due to incorrect timestamps
in some mutations. More information is available in [1].

Reopens: scylladb/scylladb#7620
Reopens: scylladb/scylladb#13957

Fixes scylladb/scylladb#15530.

[1] https://github.com/scylladb/scylladb/pull/15687
2023-10-11 00:32:05 +03:00
Kamil Braun
72cd457d53 feature_service: add GROUP0_SCHEMA_VERSIONING feature
This feature, when enabled, will modify how schema versions
are calculated and stored.

- In group 0 mode, schema versions are persisted by the group 0 command
  that performs the schema change, then reused by each node instead of
  being calculated as a digest (hash) by each node independently.
- In RECOVERY mode or before Raft upgrade procedure finishes, when we
  perform a schema change, we revert to the old digest-based way, taking
  into account the possibility of having performed group0-mode schema
  changes (that used persistent versions). As we will see in future
  commits, this will be done by storing additional flags and tombstones
  in system tables.

By "schema versions" we mean both the UUIDs returned from
`schema::version()` and the "global" schema version (the one we gossip
as `application_state::SCHEMA`).

For now, in this commit, the feature is always disabled. Once all
necessary code is setup in following commits, we will enable it together
with Raft.
2023-09-15 13:04:04 +02:00
Tomasz Grabiec
f2ed9fcd7e schema_mutations, migration_manager: Ignore empty partitions in per-table digest
Schema digest is calculated by querying for mutations of all schema
tables, then compacting them so that all tombstones in them are
dropped. However, even if the mutation becomes empty after compaction,
we still feed its partition key. If the same mutations were compacted
prior to the query, because the tombstones expire, we won't get any
mutation at all and won't feed the partition key. So schema digest
will change once an empty partition of some schema table is compacted
away.

Tombstones expire 7 days after schema change which introduces them. If
one of the nodes is restarted after that, it will compute a different
table schema digest on boot. This may cause performance problems. When
sending a request from coordinator to replica, the replica needs
schema_ptr of exact schema version request by the coordinator. If it
doesn't know that version, it will request it from the coordinator and
perform a full schema merge. This adds latency to every such request.
Schema versions which are not referenced are currently kept in cache
for only 1 second, so if request flow has low-enough rate, this
situation results in perpetual schema pulls.

After ae8d2a550d, it is more liekly to
run into this situation, because table creation generates tombstones
for all schema tables relevant to the table, even the ones which
will be otherwise empty for the new table (e.g. computed_columns).

This change inroduces a cluster feature which when enabled will change
digest calculation to be insensitive to expiry by ignoring empty
partitions in digest calculation. When the feature is enabled,
schema_ptrs are reloaded so that the window of discrepancy during
transition is short and no rolling restart is required.

A similar problem was fixed for per-node digest calculation in
18f484cc753d17d1e3658bcb5c73ed8f319d32e8. Per-table digest calculation
was not fixed at that time because we didn't persist enabled features
and they were not enabled early-enough on boot for us to depend on
them in digest calculation. Now they are enabled before non-system
tables are loaded so digest calculation can rely on cluster features.

Fixes #4485.
2023-07-03 23:06:55 +02:00
Jadw1
2c46222e31 db,gms: Add SCYLLA_AGGREGATES schema features
This schema feature will be used to guard
system_schema.scylla_aggregates schema table.
2022-07-18 14:18:48 +02:00
Piotr Sarna
120980ac8e db,gms: add SCYLLA_KEYSPACE schema feature
This schema feature will be used to guard the upcoming
system_schema.scylla_keyspaces schema table.
2022-04-08 09:17:00 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Piotr Jastrzebski
782f2caf41 schema_features: add PER_TABLE_PARTITIONERS feature
With per table partitioners, partitioner name will be a part
of table schema. To allow rolling upgrade we need to perform
special logic that hides new partitioner name schema column
during the upgrade. This commit adds new schema feature that
controls this logic.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-15 10:25:20 +01:00
Piotr Jastrzebski
4639989964 cdc: Add CDC_OPTIONS schema_feature
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-05 14:39:23 +02:00
Piotr Sarna
a0e02df36a service: add computed columns feature
Computed columns feature should be checked before creating
index schemas the new way - by adding computed column names
to system_schema.computed_columns.
2019-07-19 11:58:42 +02:00
Tomasz Grabiec
9de071d214 schema_tables, storage_service: Make schema digest insensitive to expired tombstones in empty partition
Schema digest is calculated by querying for mutations of all schema
tables, then compacting them so that all tombstones in them are
dropped. However, even if the mutation becomes empty after compaction,
we still feed its partition key. If the same mutations were compacted
prior to the query, because the tombstones expire, we won't get any
mutation at all and won't feed the partition key. So schema digest
will change once an empty partition of some schema table is compacted
away.

That's not a problem during normal cluster operation because the
tombstones will expire at all nodes at the same time, and schema
digest, although changes, will change to the same value on all nodes
at about the same time.

This fix changes digest calculation to not feed any digest for
partitions which are empty after compaction.

The digest returned by schema_mutations::digest() is left unchanged by
this patch. It affects the table schema version calculation. It's not
changed because the version is calculated on boot, where we don't yet
know all the cluster features. It's possible to fix this but it's more
complicated, so this patch defers that.

Refs #4485.

Asd
2019-05-14 10:43:06 +02:00
Tomasz Grabiec
0633fcde10 schema: Introduce schema_features 2019-04-28 15:50:12 +02:00