Commit Graph

329 Commits

Author SHA1 Message Date
Michael Litvak
c1f3517a75 view_builder: improve migration to v2 with intermediate phase
Add an intermediate phase to the view builder migration to v2 where we
write to both the old and new table in order to not lose writes during
the migration.
We add an additional view builder version v1_5 between v1 and v2 where
we write to both tables. We perform a barrier before moving to v2 to
ensure all the operations to the old table are completed.
2024-09-05 15:42:35 +03:00
Michael Litvak
f3887cd80b db:system_keyspace: add view_builder_version to scylla_local
Add a new scylla_local parameter view_builder_version, and functions to
read and mutate the value.
The version value defaults to v1 if it doesn't exist in the table.
2024-09-05 15:41:04 +03:00
Michael Litvak
bf4a58bf91 db:system_keyspace: add view_build_status_v2 table
add the table system.view_build_status_v2 with the same schema as
system_distributed.view_build_status.
2024-09-05 15:41:04 +03:00
Laszlo Ersek
9e95f3a198 db/system_keyspace: promise a tagged integer from increment_and_get_generation()
Internally, increment_and_get_generation() produces a
"gms::generation_type" value.

In turn, all callers of increment_and_get_generation() -- namely
scylla_main() [main.cc] and single_node_cql_env::run_in_thread()
[test/lib/cql_test_env.cc] -- pass the resolved value to
storage_service::init_address_map() and storage_service::join_cluster(),
both of which take a "gms::generation_type".

Therefore it is pointless to "untag" the generation value temporarily
between the producer and the consumers. Correct the return type of
increment_and_get_generation().

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-08-14 13:35:08 +02:00
Aleksandra Martyniuk
c64cb98bcf db: node_ops: filter topology request entries
system_keyspace::get_topology_request_entries returns entries for
requests which are running or have finished after specified time.

In task manager node ops task set the time so that they are shown
for task_ttl seconds after they have finished.
2024-07-23 13:35:02 +02:00
Aleksandra Martyniuk
94282b5214 db: service: modify methods to get topology_requests data
Modify get_topology_request_state (and wait_for_topology_request_completion),
so that it doesn't call on_internal_error when request_id isn't
in the topology_requests table if require_entry == false.

Add other methods to get topology request entry.
2024-07-23 13:35:01 +02:00
Marcin Maliszkiewicz
2ab143fb40 db: auth: move auth tables to system keyspace
Separate keyspace which also behaves as system brings
little benefit while creating some compatibility problems
like schema digest mismatch during rollback. So we decided
to move auth tables into system keyspace.

Fixes https://github.com/scylladb/scylladb/issues/18098

Closes scylladb/scylladb#18769
2024-05-26 22:30:42 +03:00
Benny Halevy
1061455442 gossiper: add load_endpoint_state
Pack the topology-related data loaded from system.peers
in `gms::load_endpoint_state`, to be used in a following
patch for `add_saved_endpoint`.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-04-14 15:06:56 +03:00
Marcin Maliszkiewicz
562caaf6c6 auth: keep auth version in scylla_local
Before the patch selection of auth version depended
on consistent topology feature but during raft recovery
procedure this feature is disabled so we need to persist
the version somewhere to not switch back to v1 as this
is not supported.

During recovery auth works in read-only mode, writes
will fail.
2024-04-02 19:04:21 +02:00
Michał Jadwiszczak
dab909b1d1 service:qos: store information about sl data migration
Save information whether service levels data was migrated to v2 table.
The information is stored in `system.scylla_local` table. It's
written with raft command and included in raft snapshot.
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
8e242f5acd db:system_keyspace: add SERVICE_LEVELS_V2 table
The table has the same schema as `system_distributed.service_levels`.
However it's created entirely at once (unlike old table which creates
base table first and then it adds other columns) because `system` tables
are local to the node.
2024-03-21 23:14:57 +01:00
Avi Kivity
e48eb76f61 sstables_manager: decouple from system_keyspace
sstables_manager now depends on system_keyspace for access to the
system.sstables table, needed by object storage. This violates
modularity, since sstables_manager is a relatively low-level leaf
module while system_keyspace integrates large parts of the system
(including, indirectly, sstables_manager).

One area where this is grating is sstables::test_env, which has
to include the much higher level cql_test_env to accommodate it.

Fix this by having sstables_manager expose its dependency on
system_keyspace as an interface, sstables_registry, and have
system_keyspace implement the glue logic in
system_keyspace_sstables_manager.

Closes scylladb/scylladb#17868
2024-03-18 20:38:07 +03:00
Tomasz Grabiec
61b3453552 raft topology: Keep nodes in the left state to topology
Those nodes will be kept in tablet replica sets for a while after node
replace is done, until the new replica is rebuilt. So we need to know
about those node's location (dc, rack) for two reasons:

 1) algorithms which work with replica sets filter nodes based on
 their location. For example materialized views code which pairs base
 replicas with view replicas filters by datacenter first.

 2) tablet scheduler needs to identify each node's location in order
 to make decisions about new replica placement.

It's ok to not know the IP, and we don't keep it. Those nodes will not
be present in the IP-based replica sets, e.g. those returned by
get_natural_endpoints(), only in host_id-based replica
sets. storage_proxy request coordination is not affected.

Nodes in the left state are still not present in token ring, and not
considered to be members of the ring (datacanter endpoints excludes them).

In the future we could make the change even more transparent by only
loading locator::node* for those nodes and keeping node* in tablet
replica sets.

We load topology infromation only for left nodes which are actually
referenced by any tablet. To achieve that, topology loading code
queries system.tablet for the set of hosts. This set is then passed to
system.topology loading method which decides whether to load
replica_state for a left node or not.
2024-03-15 11:05:29 +01:00
Marcin Maliszkiewicz
ebb0ffeb6c auth: implement auth-v2 migration
During raft topology upgrade procedure data from
system_auth keyspace will be migrated to system_auth_v2.

Migration works mostly on top of CQL layer to minimize
amount of new code introduced, it mostly executes SELECTs
on old tables and then INSERTs on new tables. Writes are
not executed as usual but rather announced via raft.
2024-03-01 16:25:14 +01:00
Marcin Maliszkiewicz
5a6d4dbc37 storage_service: add support for auth-v2 raft snapshots
This patch adds new RPC for pulling snapshot of auth tables.
2024-03-01 16:25:14 +01:00
Patryk Jędrzejczak
b8aa74f539 raft topology: clean only obsolete CDC generations' data
Currently, we may clear a CDC generation's data from
CDC_GENERATIONS_V3 if it is not the last committed generation
and it is at least 24 hours old (according to the topology
coordinator's clock). However, after allowing writes to the
previous CDC generations, this condition became incorrect. We
might clear data of a generation that could still be written to.

The new solution is to clear data of the generations that
finished operating more than 24 hours ago. The rationale behind
it is in the new comment in
`topology_coordinator:clean_obsolete_cdc_generations`.

The previous solution used the clean-up candidate. After
introducing `committed_cdc_generations`, it became unneeded.
The last obsolete generation can be computed in
`topology_coordinator:clean_obsolete_cdc_generations`. Therefore,
we remove all the code that handles the clean-up candidate.

After changing how we clear CDC generations' data,
`test_current_cdc_generation_is_not_removed` became obsolete.
The tested feature is not present in the code anymore.

`test_dependency_on_timestamps` became the only test case covering
the CDC generation's data clearing. We adjust it after the changes.
2024-02-20 12:35:18 +01:00
Petr Gusev
86410d71d1 system_keyspace::update_peer_info: make host_id an explicit parameter
The host_id field should always be set, so it's more
appropriate to pass it as a separate parameter.

The function storage_service::get_peer_info_for_update
is  updated. It shouldn't look for host_id app
state is the passed map, instead the callers should
get the host_id on their own.
2024-02-15 13:21:04 +04:00
Petr Gusev
4a14988735 system_keyspace: remove duplicate ips for host_id
When a node changes IP we call sync_raft_topology_nodes
from raft_ip_address_updater::on_endpoint_change with
the old IP value in prev_ip parameter.
It's possible that the nodes crashes right after
we insert a new IP for the host_id, but before we
remove the old IP. In this commit we fix the
possible inconsistency by removing the system.peers
record with old timestamp. This is what the new
peers_table_read_fixup function is responsible for.

We call this function in all system_keyspace methods
that read the system.peers table. The function
loads the table in memory, decides if some rows
are stale by comparing their timestamps and
removes them.

The new function also removes the records with no
host_id, so we no longer need the get_host_id function.

We'll add a test for the problem this commit fixes
in the next commit.
2024-02-15 13:21:04 +04:00
Piotr Dulikowski
fb02453686 system_keyspace: add read_cdc_generation_opt
The `system_keyspace::read_cdc_generation` loads a cdc generation from
the system tables. One of its preconditions is that the generation
exists - this precondition is quite easy to satisfy in raft mode, and
the function was designed to be used solely in that mode.

In legacy mode however, in case when we revert from raft mode through
recovery, it might be necessary to use generations created in raft mode
for some time. In order to make the function useful as a fallback in
case lookup of a generation in legacy mode fails, introduce a relaxed
variant of `read_cdc_generation` which returns std::nullopt if the
generation does not exist.
2024-02-08 19:12:28 +01:00
Piotr Dulikowski
3513a07d8a feature_service: fall back to checking legacy features on startup
When checking features on startup (i.e. whether support for any feature
was revoked in an unsafe way), it might happen that upgrade to raft
topology didn't finish yet. In that case, instead of loading an empty
set of features - which supposedly represents the set of features that
were enabled until last boot - we should fall back to loading the set
from the legacy `enabled_features` key in `system.scylla_local`.
2024-02-08 19:12:28 +01:00
Michał Chojnowski
7c5a8894be db: system_keyspace: add system.commitlog_cleanups
Add a system table which will hold records of cleanup operations,
for the purpose of filtering commitlog replays to avoid data
resurrections.
2024-01-24 10:37:38 +01:00
Gleb Natapov
84197ff735 storage_service: topology coordinator: check topology operation completion using status in topology_requests table
Instead of trying to guess if a request completed by looking into the
topology state (which is sometimes can be error prone) look at the
request status in the new topology_requests. If request failed report
a reason for the failure from the table.
2024-01-16 17:02:54 +02:00
Gleb Natapov
ecb8778950 system keyspace: introduce local table to store topology requests status
The table has the following schema and will be managed by raft:

CREATE TABLE system.topology_requests (
    id timeuuid PRIMARY KEY,

    initiating_host uuid,
    start_time timestamp,

    done boolean,
    error text,
    end_time timestamp,
);

In case of an request completing with an error the "error" filed will be non empty when "done" is set to true.
2024-01-16 13:57:16 +02:00
Kefu Chai
be364d30fd db: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16664
2024-01-09 11:44:19 +02:00
Benny Halevy
c520fc23f0 system_keyspace: update_peer_info: drop single-column overloads
They are no longer used.
Instead, all callers now pass peer_info.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:34 +02:00
Benny Halevy
74159bb5ae system_keyspace: drop update_tokens(endpoint, tokens) overload
It is unused now after the previous patch
to update_peer_info in one call.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:34 +02:00
Benny Halevy
b2735d47f7 system_keyspace: update_peer_info: use struct peer_info for all optional values
Define struct peer_info holding optional values
for all system.peers columns, allowing the caller to
update any column.

Pass the values as std::vector<std::optional<data_value>>
to query_processor::execute_internal.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:37:30 +02:00
Benny Halevy
328ce23c78 types: add data_value_list
data_value_list is a wrapper around std::initializer_list<data_value>.
Use it for passing values to `cql3::query_processor::execute_internal`
and friends.

A following path will add a std::variant for data_value_or_unset
and extend data_value_list to support unset values.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 18:17:27 +02:00
Benny Halevy
85b3232086 system_keyspace: get rid of update_cached_values
It's a no-op.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-31 10:10:51 +02:00
Avi Kivity
2853f79f96 virtual_tables: scope virtual tables registry in system_keyspace
Virtual tables are kept in a thread_local registry for deduplication
purposes. The problem is that thread_local variables are destroyed late,
possibly after the schema registry and the reactor are destroyed.
Currently this isn't a problem, but after a seastar change to
destroy the reactor after termination [1], things break.

Fix by moving the registry to system_keyspace. system_keyspace was chosen
since it was the birthplace of virtual tables.

Pimpl is used to avoid increasing dependencies.

[1] 101b245ed7
2023-12-21 16:19:42 +02:00
Kamil Braun
6a4106edf3 migration_manager: don't attach empty system.scylla_local mutation in migration request handler
In effb9fb3cb migration request handler
(called when a node requests schema pull) was extended with a
`system.scylla_local` mutation:
```
        cm.emplace_back(co_await self._sys_ks.local().get_group0_schema_version());
```

This mutation is empty if the GROUP0_SCHEMA_VERSIONING feature is
disabled.

Nevertheless, it turned out to cause problems during upgrades.
The following scenario shows the problem:

We upgrade from 5.2 to enterprise version with the aforementioned patch.
In 5.2, `system.scylla_local` does not use schema commitlog.
After the first node upgrades to the enterprise version, it immediately
on boot creates a new enterprise-only table
(`system_replicated_keys.encrypted_keys`) -- the specific table is not
important, only the fact that a schema change is performed.
This happens before the restarting node notices other nodes being UP, so
the schema change is not immediately pushed to the other nodes.
Instead, soon after boot, the other non-upgraded nodes pull the schema
from the upgraded node.
The upgraded node attaches a `system.scylla_local` mutation to the
vector of returned mutations.
The non-upgraded nodes try to apply this vector of mutations. Because
some of these mutations are for tables that already use schema
commitlog, while the `system.scylla_local` table does not use schema
commitlog, this triggers the following error (even though the mutation
is empty):
```
    Cannot apply atomically across commitlog domains: system.scylla_local, system_schema.keyspaces
```

Fortunately, the fix is simple -- instead of attaching an empty
mutation, do not attach a mutation at all if the handler of migration
request notices that group0_schema_version is not present.

Note that group0_schema_version is only present if the
GROUP0_SCHEMA_VERSIONING feature is enabled, which happens only after
the whole upgrade finishes.

Refs: scylladb/scylladb#16414

Not using "Fixes" because the issue will only be fixed once this PR is
merged to `master` and the commit is cherry-picked onto next-enterprise.

Closes scylladb/scylladb#16416
2023-12-14 22:58:13 +01:00
Tomasz Grabiec
effb9fb3cb Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun
When performing a schema change through group 0, extend the schema mutations with a version that's persisted and then used by the nodes in the cluster in place of the old schema digest, which becomes horribly slow as we perform more and more schema changes (#7620).

If the change is a table create or alter, also extend the mutations with a version for this table to be used for `schema::version()`s instead of having each node calculate a hash which is susceptible to bugs (#13957).

When performing a schema change in Raft RECOVERY mode we also extend schema mutations which forces nodes to revert to the old way of calculating schema versions when necessary.

We can only introduce these extensions if all of the cluster understands them, so protect this code by a new cluster/schema feature, `GROUP0_SCHEMA_VERSIONING`.

Fixes: #7620
Fixes: #13957

---

This is a reincarnation of PR scylladb/scylladb#15331. The previous PR was reverted due to a bug it unmasked; the bug has now been fixed (scylladb/scylladb#16139). Some refactors from the previous PR were already merged separately, so this one is a bit smaller.

I have checked with @Lorak-mmk's reproducer (https://github.com/Lorak-mmk/udt_schema_change_reproducer -- many thanks for it!) that the originally exposed bug is no longer reproducing on this PR, and that it can still be reproduced if I revert the aforementioned fix on top of this PR.

Closes scylladb/scylladb#16242

* github.com:scylladb/scylladb:
  docs: describe group 0 schema versioning in raft docs
  test: add test for group 0 schema versioning
  feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode
  schema_tables: don't delete `version` cell from `scylla_tables` mutations from group 0
  migration_manager: add `committed_by_group0` flag to `system.scylla_tables` mutations
  schema_tables: use schema version from group 0 if present
  migration_manager: store `group0_schema_version` in `scylla_local` during schema changes
  system_keyspace: make `get/set_scylla_local_param` public
  feature_service: add `GROUP0_SCHEMA_VERSIONING` feature
2023-12-11 12:17:57 +01:00
Kamil Braun
3db8ac80cb migration_manager: store group0_schema_version in scylla_local during schema changes
We extend schema mutations with an additional mutation to the
`system.scylla_local` table which:
- in Raft mode, stores a UUID under the `group0_schema_version` key.
- outside Raft mode, stores a tombstone under that key.

As we will see in later commits, nodes will use this after applying
schema mutations. If the key is absent or has a tombstone, they'll
calculate the global schema digest on their own -- using the old way. If
the key is present, they'll take the schema version from there.

The Raft-mode schema version is equal to the group 0 state ID of this
schema command.

The tombstone is necessary for the case of performing a schema change in
RECOVERY mode. It will force a revert to the old digest-based way.

Note that extending schema mutations with a `system.scylla_local`
mutation is possible thanks to earlier commits which moved
`system.scylla_local` to schema commitlog, so all mutations in the
schema mutations vector still go to the same commitlog domain.

Also, since we introduce a replicated tombstone to
`system.scylla_local`, we need to set GC grace to nonzero. We set it to
`schema_gc_grace`, which makes sense given the use case.
2023-12-08 17:45:41 +01:00
Kamil Braun
1763c65662 system_keyspace: make get/set_scylla_local_param public
We'll use it outside `system_keyspace` code in later commit.
2023-12-05 13:03:29 +01:00
Benny Halevy
6c00c9a45d raft: use locator::topology/messaging rather than fb_utilities
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-05 13:26:46 +02:00
Benny Halevy
4bb4d673c3 db/system_keyspace: save_local_info: get broadcast addresses from caller
So not to rely on fb_utilities.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-05 08:42:49 +02:00
Kamil Braun
8a14839a00 Merge 'handle more failures during topology operations' from Gleb
This series adds handling for more failures during a topology operation
(we already handle a failure during streaming). Here we add handling of
tablet draining errors by aborting the operation and handling of errors
after streaming where an operation cannot be aborted any longer. If the
error happens when rollback is no longer possible we wait for ring delay
and proceed to the next step. Each individual patch that adds the sleep
has an explanation what the consequences of the patch are.

* 'gleb/topology-coordinator-failures' of github.com:scylladb/scylla-dev:
  test: add test to check errro handling during tablet draining
  test: fix test_topology_streaming_failure test to not grep the whole file
  storage_service: add error injection into the tablet migration code
  storage_service: topology coordinator: rollback on handle_tablet_migration failure during tablet_draining stage
  storage_service: topology coordinator: do not retry the metadata barrier forever in write_both_read_new state
  storage_service: topology coordinator: do not retry the metadata barrier forever in left_token_ring state
  storage_service: topology coordinator: return a node that is being removed from get_excluded_nodes
  storage_service: topology_coordinator: use new rollback_to_normal state in the rollback procedure
  storage_service: topology coordinator: add rollback_to_normal node state
  storage_service: topology coordinator: put fence version into the raft state
  storage_service: topology coordinator: do fencing even if draining failed
2023-11-29 19:02:35 +01:00
Kamil Braun
de3607810d system_keyspace: fix outdated comment 2023-11-23 14:06:27 +01:00
Gleb Natapov
6edbf4b663 storage_service: topology coordinator: put fence version into the raft state
Currently when the coordinator decides to move the fence it issues an
RPC to each node and each node locally advances fence version. This is
fine if there are no failures or failures are handled by retrying
fencing, but if we want to allow topology changes to progress even in
the presence of barrier failures it is easier to store the fence version
in the raft state. The nodes that missed fence rpc may easily catch up
to the latest fence version by simply executing a raft barrier.
2023-11-19 15:28:08 +02:00
Pavel Emelyanov
d827068d01 sstables,s3: Support state change (without generation change)
Now when the system.sstables has the state field, it can be changed
(UPDATEd). However, when changing the state AND generation, this still
won't work, because generation is the clustering key of the table in
question and cannot be just changed. This, nonetheless, is OK, as
generation changes with state only when moving an sstable from upload
dir into normal/staging and this is separate issue for S3 (#13018). For
now changing state only is OK.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-24 19:12:37 +03:00
Pavel Emelyanov
ca5d3d217f system_keyspace: Add state field to system.sstables
The state is one of <empty>(normal)/staging/quarantine. Currently when
sstable is moved to non-normal state the s3 backend state_change() call
throws thus such sstables do not appear. Next patches are going to
change that and the new field in the system.sstables is needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-10-24 19:12:37 +03:00
Kefu Chai
b36cef6f1a sstable: remove _remote_prefix from s3_storage
since we use the sstable.generation() for the remote prefix of
the key of the object for storing the sstable component, there is
no need to set remote_prefix beforehand.

since `s3_storage::ensure_remote_prefix()` and
`system_kesypace::sstables_registry_lookup_entry()` are not used
anymore, they are removed.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-10-23 10:08:22 +08:00
Kefu Chai
af8bc8ba63 sstable: switch to uuid identifier for naming S3 sstable objects
before this change, we create a new UUID for a new sstable managed
by the s3_storage, and we use the string representation of UUID
defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for
naming the objects stored on s3_storage. but this representation is
not what we are using for storing sstables on local filesystem when
the option of "uuid_sstable_identifiers_enabled" is enabled. instead,
we are using a base36-based representation which is shorter.

to be consistent with the naming of the sstables created for local
filesystem, and more importantly, to simplify the interaction between
the local copy of sstables and those stored on object storage, we should
use the same string representation of the sstable identifier.

so, in this change:

1. instead of creating a new UUID, just reuse the generation of the
   sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As
   we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined
   by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the
   base36-based one (implemented by
   fmt::formatter<sstables::generation_type>)
4. enable the `uuid_sstable_identifers` cluster feature if it is
   enabled in the `test_env_config`, so that it the sstable manager
   can enable the uuid-based uuid when creating a new uuid for
   sstable.
5. throw if the generation of sstable is not UUID-based when
   accessing / manipulating an sstable with S3 storage backend. as
   the S3 storage backend now relies on this option. as, otherwise
   we'd have sstables with key like s3://bucket/number/basename, which
   is just unable to serve as a unique id for sstable if the bucket is
   shared across multiple tables.

Fixes #14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-10-23 10:08:22 +08:00
Calle Wilund
6fbd210679 system.truncated: Remove replay_position data from truncated on start
Once we've started clean, and all replaying is done, truncation logs
commit log regarding replay positions are invalid. We should exorcise
them as soon as possible. Note that we cannot remove truncation data
completely though, since the time stamps stored are used by things like
batch log to determine if it should use or discard old batch data.
2023-10-17 18:16:48 +04:00
Tomasz Grabiec
0aef0f900b Merge 'truncation records refactorings' from Petr Gusev
This PR contains several refactoring, related to truncation records handling in `system_keyspace`, `commitlog_replayer` and `table` clases:
* drop map_reduce from `commitlog_replayer`, it's sufficient to load truncation records from the null shard;
* add a check that `table::_truncated_at` is properly initialized before it's accessed;
* move its initialization after `init_non_system_keyspaces`

Closes scylladb/scylladb#15583

* github.com:scylladb/scylladb:
  system_keyspace: drop truncation_record
  system_keyspace: remove get_truncated_at method
  table: get_truncation_time: check _truncated_at is initialized
  database: add_column_family: initialize truncation_time for new tables
  database: add_column_family: rename readonly parameter to is_new
  system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace
  commitlog_replayer: refactor commitlog_replayer::impl::init
  system_keyspace: drop redundant typedef
  system_keyspace: drop redundant save_truncation_record overload
  table: rename cache_truncation_record -> set_truncation_time
  system_keyspace: get_truncated_position -> get_truncated_positions
2023-10-17 10:55:30 +02:00
Avi Kivity
35849fc901 Revert "Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun"
This reverts commit 3d4398d1b2, reversing
changes made to 45dfce6632. The commit
causes some schema changes to be lost due to incorrect timestamps
in some mutations. More information is available in [1].

Reopens: scylladb/scylladb#7620
Reopens: scylladb/scylladb#13957

Fixes scylladb/scylladb#15530.

[1] https://github.com/scylladb/scylladb/pull/15687
2023-10-11 00:32:05 +03:00
Petr Gusev
a6087a10bd system_keyspace: drop truncation_record
This is a refactoring commit without observable
changes in behaviour.

The only usage was in get_truncation_records
method which can be inlined.
2023-10-05 15:19:59 +04:00
Petr Gusev
9d350e7532 system_keyspace: remove get_truncated_at method
The only usage is in batchlog_manager, and it
can be replaced with cf.get_truncation_time().

std::optional<std::reference_wrapper<canonical_mutation>>
is replaced with canonical_mutation* since it is
semantically the same but with less type boilerplate.
2023-10-05 15:19:59 +04:00
Petr Gusev
b70bca71bc system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace
load_truncation_times() now works only for
schema tables since the rest is not loaded
until distributed_loader::init_non_system_keyspaces.
An attempt to call cf.set_truncation_time
for non-system table just throws an exception,
which is caught and logged with debug level.
This means that the call cf.get_truncation_time in
paxos_state.cc has never worked as expected.

To fix that we move load_truncation_times()
closer to the point where the tables are loaded.
The function distributed_loader::populate_keyspace is
called for both system and non-system tables. Once
the tables are loaded, we use the 'truncated' table
to initialize _truncated_at field for them.

The truncation_time check for schema tables is also moved
into populate_keyspace since is seems like a more natural
place for it.
2023-10-05 15:19:52 +04:00
Petr Gusev
c94946d566 system_keyspace: drop redundant typedef 2023-10-03 17:11:40 +04:00