Commit Graph

523 Commits

Author SHA1 Message Date
Avi Kivity
d69bf4f010 cql3: introduce dialect infrastructure
A dialect is a different way to interpret the same CQL statement.

Examples:
 - how duplicate bind variable names are handled (later in this series)
 - whether `column = NULL` in LWT can return true (as is now) or
   whether it always returns NULL (as in SQL)

Currently, dialect is an empty structure and will be filled in later.
It is passed to query_processor methods that also accept a CQL string,
and from there to the parser. It is part of the prepared statement cache
key, so that if the dialect is changed online, previous parses of the
statement are ignored and the statement is prepared again.

The patch is careful to pick up the dialect at the entry point (e.g.
CQL protocol server) so that the dialect doesn't change while a statement
is parsed, prepared, and cached.
2024-08-29 21:19:23 +03:00
Benny Halevy
52234214e5 schema_tables: calculate_schema_digest: filter the key earlier
Currently, each frozen mutation we get from
system_keyspace::query_mutations is unfrozen in whole
to a mutation and only then we check its key with
the provided `accept_keyspace` function.

This is wasteful, since they key can be processed
directly form the frozen mutation, before taking
the toll of unfreezing it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-08-15 12:33:34 +03:00
Benny Halevy
95a5fba0ea schema_tables: calculate_schema_digest: prevent stalls due to large mutations vector
With a large number of table the schema mutations
vector might get big enoug to cause reactor stalls
when freed.

For example, the following stall was hit on
2023.1.0~rc1-20230208.fe3cc281ec73 with 5000 tables:
```
 (inlined by) ~vector at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_vector.h:730
 (inlined by) db::schema_tables::calculate_schema_digest(seastar::sharded<service::storage_proxy>&, enum_set<super_enum<db::schema_feature, (db::schema_feature)0, (db::schema_feature)1, (db::schema_feature)2, (db::schema_feature)3, (db::schema_feature)4, (db::schema_feature)5, (db::schema_feature)6, (db::schema_feature)7> >, seastar::noncopyable_function<bool (std::basic_string_view<char, std::char_traits<char> >)>) at ./db/schema_tables.cc:799
```

This change returns a mutations generator from
the `map` lambda coroutine so we can process them
one at a time, destroy the mutations one at a time,
and by that, reducing memory footprint and preventing
reactor stalls.

Fixes #18173

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-08-15 12:33:34 +03:00
Botond Dénes
b886ed44a7 db/schema_tables: pass tablet hint to update_tablet_metadata()
Replace the has_tablet_mutations in `merge_tables_and_views()` with a
hint parameter, which is calculated in the caller, from the original
schema change mutations. This hint is then forwarded to the notifier's
`update_tablet_metadata()` so that subscribers can refresh only the
tablet partitions that changed.
2024-08-11 09:53:19 -04:00
Botond Dénes
2cec0d8dd1 service/migration_listener: update_tablet_metadata(): add hint parameter
The hint contains information related to what exactly changed, allowing
listeners to do partial updates, instead of reloading all metadata on
each notification.
2024-08-11 09:53:19 -04:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Marcin Maliszkiewicz
32fe101f9d db: replica: batch functions schema modifications
Before each function change was immediately visible as
during event notification logic yielded.

Now we first gather the modifications and then commit them.

Further work will broaden the scope of atomicity to the whole
schema and even across other subsystems.
2024-07-04 13:10:26 +02:00
Marcin Maliszkiewicz
16b770ff1a cql3: functions: make functions class non-static
This is done to ease code reuse in the following commit.
It'd also help should we ever want properly mount functions
class to schema object instead of static storage.
2024-07-04 10:24:57 +02:00
Benny Halevy
b7f00ba4bf schema_tables: remove dead code
Well, even after 10 years, the c++ compilers still
do not compile Java...

And having that legacy code laying around
not only it doesn't help anyone understand what's
going on, but on the contrary, it's confusing and distracting.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-06-27 20:34:02 +03:00
Nadav Har'El
4faceeaa33 Merge 'treewide: drop thrift support' from Kefu Chai
thrift support was deprecated since ScyllaDB 5.2

> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.

so let's drop it. in this change,

* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is preserved for backward compatibility, as we could load from an existing system.local table which still contains this clolumn, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for backward compatibility with java-based nodetool.

Fixes #3811
Fixes #18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

- [x] not a fix, no need to backport

Closes scylladb/scylladb#18453

* github.com:scylladb/scylladb:
  config: expand on rpc_keepalive's description
  api: s/rpc/thrift/
  db/system_keyspace: drop thrift_version from system.local table
  transport: do not return client_type from cql_server::connection::make_client_key()
  treewide: drop thrift support
2024-06-17 22:36:49 +03:00
Pavel Emelyanov
4400f9082e lang: Return context as future, not via reference argument
Commit 882b2f4e9f (cql3, schema_tables: Generalize function creation)
erroneously says that optional<context> is not suitable for future<>
type, but in fact it is.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#19204
2024-06-12 16:54:46 +03:00
Pavel Emelyanov
b0ffc03599 lang, cql3, schema_tables: Don't mess with db::config
Not function context creation is encapsulated in lang::manager so it's
possible to patch-out few more places that use database as config
provider.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 13:07:05 +03:00
Pavel Emelyanov
882b2f4e9f cql3, schema_tables: Generalize function creation
When a function is created with the CREATE FUNCTION statement, the
statement handler does all the necessary preparations on its own. The
very same code exists in schema_tables, when the function is loaded on
boot. This patch generalizes both and keeps function language-specific
context creation inside lang/ code.

The creation function returns context via argument reference. It would
have been nicer if it was returned via future<>, but it's not suitable
for future<T> type :(

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 13:07:05 +03:00
Pavel Emelyanov
1dec79e97d lang: Move wasm::manager to its .cc/.hh files
It's going to become a facade in front of both -- wasm and lua, so keep
it in files with language independent names.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 12:35:57 +03:00
Kefu Chai
ad649be1bf treewide: drop thrift support
thrift support was deprecated since ScyllaDB 5.2

> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.

so let's drop it. in this change,

* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is
  preserved for backward compatibility, as we could load
  from an existing system.local table which still contains
  this clolumn, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for
  backward compatibility with java-based nodetool.
* `rpc_port` and `start_rpc` options are preserved, but
  they are marked as "Unused". so that the new release
  of scylladb can consume existing scylla.yaml configurations
  which might contain these settings. by making them
  deprecated, user will be able get warned, and update
  their configurations before we actually remove them
  in the next major release.

Fixes #3811
Fixes #18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-07 06:44:59 +08:00
Gleb Natapov
1faef47952 schema_tables: remove unused code 2024-05-29 11:30:24 +03:00
Marcin Maliszkiewicz
2ab143fb40 db: auth: move auth tables to system keyspace
Separate keyspace which also behaves as system brings
little benefit while creating some compatibility problems
like schema digest mismatch during rollback. So we decided
to move auth tables into system keyspace.

Fixes https://github.com/scylladb/scylladb/issues/18098

Closes scylladb/scylladb#18769
2024-05-26 22:30:42 +03:00
Kefu Chai
dbfdc71d2d treewide: fix typos in comment and error messages
these typos were identified by codespell

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18868
2024-05-26 11:54:36 +03:00
Avi Kivity
61505d057e Merge 'Sort user-defined types in describe statements' from Michał Jadwiszczak
User-defined types can depend on each other, creating directed acyclic graph.

In order to support restoring schema from `DESC SCHEMA`, UDTs should be
ordered topologically, not alphabetically as it was till now.

This patch changes the way UDTs are ordered in `DESC SCHEMA`/`DESC KEYSPACE <ks>` statements, so the output can be safely copy-pasted to restore the schema.

Fixes #18539

Closes scylladb/scylladb#18302

* github.com:scylladb/scylladb:
  test/cql-pytest/test_describe: add test for UDTs ordering
  cql3/statements/describe_statement: UDTs topological sorting
  cql3/statements/describe_statement: allow to skip alphabetical sorting
  types: add a method to get all referenced user types
  db/cql_type_parser: use generic topological sorting
  db/cql_type_parses: futurize raw_builder::build()
  test/boost: add test for topological sorting
  utils: introduce generic topological sorting algorithm
2024-05-20 16:58:17 +03:00
Avi Kivity
54a82fed6b feature, index: grandfather CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX
This feature corrected how we store the token in secondary indexes. It
was introduced in 7ff72b0ba5 (2020; 4.4) and can now be assumed present
everywhere. Note that we still support indexes created with the old format.
2024-05-18 00:24:11 +03:00
Avi Kivity
3bead8cea0 feature: grandfather PER_TABLE_PARTITIONERS
The PER_TABLE_PARTITIONERS feature was added in 90df9a44ce (2020; 4.0)
and can now be assumed to be always present. We also remove the associated
schema_feature.
2024-05-18 00:15:07 +03:00
Avi Kivity
c7d7ca2c23 feature: grandfather CDC
The CDC feature was made non-experimental in e9072542c1 (2020; 4.4)
and can now be assumed to be always present. We also remove the corresponding
schema_feature.
2024-05-17 20:41:20 +03:00
Avi Kivity
b5f6021a6b feature: grandfather VIEW_VIRTUAL_COLUMNS
The VIEW_VIRTUAL_COLUMNS feature was added in a108df09f9 (2019; 3.1)
and can now be assumed to be always present.

The corresponding schema_feature is removed. Note schema_features are not sent
over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.
2024-05-17 20:41:19 +03:00
Michał Jadwiszczak
3830f3bd23 db/cql_type_parses: futurize raw_builder::build()
In order to use generic topological sort,
build() method needs to return future.
2024-05-16 13:30:03 +02:00
Benny Halevy
7f372dd9ae schema_tables: convert_schema_to_mutations: make_canonical_mutation_gently
To prevent stalls due to large schema mutations.
While at it, reserve the result canonical_mutation vector.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-05-02 19:37:05 +03:00
Benny Halevy
61dea98185 schema_tables: redact_columns_for_missing_features: get input mutation using rvalue reference
The function upgrades the input mutation
only in certain cases.  Currently it accepts
the input mutation by value, which may cause
and extraneous copy if the caller doesn't move
the mutation, as done in
`adjust_schema_for_schema_features`.

Getting an rvalue reference instead makes the
interface clearer.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-05-02 19:37:05 +03:00
Benny Halevy
a126160d7e frozen_mutation: move unfreeze_gently to async_utils
Unfreeze_gently doesn't have to be a method of
frozen_mutation.  It might as well be implemented as
a free function reading from a frozen_mutation
and preparing a mutation gently.

The logic will be used in a later patch
to make a canonical mutation directly from
a frozen_mutation instead of unfreezing it
and then converting it to a canonical_mutation.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-05-02 19:27:56 +03:00
Kamil Braun
d8313dda43 Merge 'db: config: move consistent-topology-changes out of experimental and make it the default for new clusters' from Patryk Jędrzejczak
We move consistent cluster management out of experimental and
make it the default for new clusters in 6.0. In code, we make the
`consistent-topology-changes` flag unused and assumed to be true.

In 6.0, the topology upgrade procedure will be manual and
voluntary, so some clusters will still be using the gossip-based
topology even though they support the raft-based topology.
Therefore, we need to continue testing the gossip-based topology.
This is possible by using the `force-gossip-topology-changes` flag
introduced in scylladb/scylladb#18284.

Ref scylladb/scylladb#17802

Closes scylladb/scylladb#18285

* github.com:scylladb/scylladb:
  docs: raft.rst: update after removing consistent-topology-changes
  treewide: fix indentation after the previous patch
  db: config: make consistent-topology-changes unused
  test: lib: single_node_cql_env: restart a node in noninitial run_in_thread calls
  test: test_read_required_hosts: run with force-gossip-topology-changes
  storage_service: join_cluster: replace force_gossip_based_join with force-gossip-topology-changes
  storage_service: join_token_ring: fix finish_setup_after_join calls
2024-04-26 14:45:29 +02:00
Patryk Jędrzejczak
0d428a3857 treewide: fix indentation after the previous patch 2024-04-25 14:33:21 +02:00
Patryk Jędrzejczak
3a34bb18cd db: config: make consistent-topology-changes unused
We make the `consistent-topology-changes` experimental feature
unused and assumed to be true in 6.0. We remove code branches that
executed if `consistent-topology-changes` was disabled.
2024-04-25 14:33:21 +02:00
Kefu Chai
c323c93fa4 treewide: remove {dclocal_,}read_repair_chance options
dclocal_read_repair_chance and read_repair_chance have been removed
in Cassandra 3.11 and 4.x, see
https://issues.apache.org/jira/browse/CASSANDRA-13910.
if we expose the properties via DDL, Cassandra would fails to consume
the CQL statement to creating the table when performing migration
from Scylla to Cassandra 4.x, as the latter does not understand
these properties anymore.

currently the default values of `dc_local_read_repair_chance` and
`read_repair_chance` are both "0". so this is practically disabled,
unless user deliberately set them to a value greater than 0.

also, as a side effect, Cassandra 4.x has better support of
Python3. the cqlsh shipped along with Cassandra 3.11.16 only
supports python2.7, see
https://github.com/apache/cassandra/blob/cassandra-3.11.16/bin/cqlsh.py
it errors out if the system only provides python3 with the error
of

```
No appropriate python interpreter found.
```
but modern linux systems do not provide python2 anymore.

so, in this change, we deprecate these two options.

Fixes #3502
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-25 17:15:27 +08:00
Kefu Chai
372a4d1b79 treewide: do not define FMT_DEPRECATED_OSTREAM
since we do not rely on FMT_DEPRECATED_OSTREAM to define the
fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`.

in this change,

* utils: drop the range formatters in to_string.hh and to_string.c, as
  we don't use them anymore. and the tests for them in
  test/boost/string_format_test.cc are removed accordingly.
* utils: use fmt to print chunk_vector and small_vector. as
  we are not able to print the elements using operator<< anymore
  after switching to {fmt} formatters.
* test/boost: specialize fmt::details::is_std_string_like<bytes>
  due to a bug in {fmt} v9, {fmt} fails to format a range whose
  element type is `basic_sstring<uint8_t>`, as it considers it
  as a string-like type, but `basic_sstring<uint8_t>`'s char type
  is signed char, not char. this issue does not exist in {fmt} v10,
  so, in this change, we add a workaround to explicitly specialize
  the type trait to assure that {fmt} format this type using its
  `fmt::formatter` specialization instead of trying to format it
  as a string. also, {fmt}'s generic ranges formatter calls the
  pair formatter's `set_brackets()` and `set_separator()` methods
  when printing the range, but operator<< based formatter does not
  provide these method, we have to include this change in the change
  switching to {fmt}, otherwise the change specializing
  `fmt::details::is_std_string_like<bytes>` won't compile.
* test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends
  for comparing values. but without the operator<< based formatters,
  Boost.Test would not be able to print them. after removing
  the homebrew formatters, we need to use the generic
  `boost_test_print_type()` helper to do this job. so we are
  including `test_utils.hh` in tests so that we can print
  the formattable types.
* treewide: add "#include "utils/to_string.hh" where
  `fmt::formatter<optional<>>` is used.
* configure.py: do not define FMT_DEPRECATED_OSTREAM
* cmake: do not define FMT_DEPRECATED_OSTREAM

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:57:36 +08:00
Kefu Chai
a439ebcfce treewide: include fmt/ranges.h and/or fmt/std.h
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:16 +08:00
Benny Halevy
01fc1a9f66 schema_tables: std::move mutation into the mutation vector
To save a copy.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#18120
2024-04-01 14:16:30 +03:00
Avi Kivity
731b5c5120 schema_tables: unfreeze frozen_mutation:s gently
With large schemas, unfreezing can stall, especially as it requires
a lot of memory. Switch to a gentle version that will not stall.
2024-03-17 17:46:02 +02:00
Benny Halevy
5bfca73b30 migration_manager: notify before_drop_column_family when dropping indices
Fixes #17627

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-03-14 20:19:12 +02:00
Benny Halevy
9cf6a2e510 schema_tables: make_update_indices_mutations: use find_schema to lookup the view of dropped indices
When dropping indices, we don't need to go through
`create_view_for_index` in order to drop the index.
That actually creates a new schema for this view
which is used just for its metadata for generating mutations
dropping it.

Instead, use `find_schema` to lookup the current schema
for the dropped index.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-03-14 20:19:11 +02:00
Marcin Maliszkiewicz
9144d8203b db: add system_auth_v2 keyspace
New keyspace is added similarly as system_schema keyspace,
it's being registred via system_keyspace::make which calls
all_tables to build its schema.

Dummy table 'roles' is added as keyspaces are being currently
registered by walking through their tables. Full table schemas
will be added in subsequent commits.

Change can be observed via cqlsh:

cassandra@cqlsh> describe keyspaces;

system_auth_v2  system_schema       system         system_distributed_everywhere
system_auth     system_distributed  system_traces

cassandra@cqlsh> describe keyspace system_auth_v2;

CREATE KEYSPACE system_auth_v2 WITH replication = {'class': 'LocalStrategy'}  AND durable_writes = true;

CREATE TABLE system_auth_v2.roles (
    role text PRIMARY KEY
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = 'comment'
    AND compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 604800
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
2024-03-01 10:40:29 +01:00
Kefu Chai
0fbfc96619 db: add formatter for schema_tables::table_kind
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for db::schema_tables::table_kind,
and its operator<<() is still used by the homebrew generic formatter
for std::map<>, so it is preserved.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16972
2024-01-25 11:33:13 +03:00
Pavel Emelyanov
129196db98 schema_tables: Use new_keyspace() sugar
The create_keyspace_from_schema_partition code creates ks metadata
without schemas and user-types. There's new_keyspace() convenience
helper for such cases.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-26 13:26:58 +03:00
Pavel Emelyanov
c43501d973 locator,schema: Move initial tablets from r.s. options to params
The option is kepd in DDL, but is _not_ stored in
system_schema.keyspaces. Instead, it's removed from the provided options
and kept in scylla_keyspaces table in its own column. All the places
that had optional initial_tablets disengaged now set this value up the
way the find appropriate.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-25 16:07:10 +03:00
Pavel Emelyanov
30e7273658 schema_tables: Relax extract_scylla_specific_ks_info() check
Nowadays reading scylla-specific info from schema happens under
respective schema feature. However (at least in raft case) when a new
node joins the cluster merging schema for the first time may happen
_before_ features are merged and enabled. Thus merging schema can go the
wrong way by errorneously skipping the scylla-specific info.

On the other hand, if system_schema.scylla_keyspaces is there it's
there, there's no reason _not_ to pick this data up in that case.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-25 16:05:01 +03:00
Pavel Emelyanov
a67c535539 keyspace_metadata: Carry optional<initial_tablets> on board
The object in question fully describes the keyspace to be created and,
among other things, contains replication strategy options. Next patches
move the "initial_tablets" option out of those options and keep it
separately, so the ks metadata should also carry this option separately.

This patch is _just_ extending the metadata creation API, in fact the
new field is unused (write-only) so all the places that need to provide
this data keep it disengaged and are explicitly marked with FIXME
comment. Next patches will fix that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-25 15:58:05 +03:00
Nadav Har'El
8181e28731 secondary index: fix view creation when using tablets
In commit 88a5ddabce, we fixed materialized
view creation to support tablets. We added to the function called to
create materialized views in CQL, prepare_new_view_announcement()
a missing call to the on_before_create_column_family() notifier that
creates tablets for this new view.

Unfortunately, We have the same problem when creating a secondary index,
because it does not use prepare_new_view_announcement(), and instead uses
a generic function to "update" the base table, which in some cases ends
up creating new views when a new index is requested. In this path, the
notifier did not get called to the notifier, so we must add it here too.
Unfortunately, the notifiers must run in a Seastar thread, which means
that yet another function now needs to run in a Seastar thread.

Before this patch, creating a secondary index in a table using tablets
fails with "Tablet map not found for table <uuid>". With this patch,
it works.

The patch also includes tests for creating a regular and local secondary
index. Both tests fail (with the aforementioned error) before this
patch, and pass with it.

Fixes #16396

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-21 11:44:50 +02:00
Kamil Braun
522540da40 schema_tables: don't delete version cell from scylla_tables mutations from group 0
As explained in the previous commit, we use the new
`committed_by_group0` flag attached to each row of a `scylla_tables`
mutation to decide whether the `version` cell needs to be deleted or
not.

The rest of #13957 is solved by pre-existing code -- if the `version`
column is present in the mutation, we don't calculate a hash for
`schema::version()`, but take the value from the column:

```
table_schema_version schema_mutations::digest(db::schema_features sf)
const {
    if (_scylla_tables) {
        auto rs = query::result_set(*_scylla_tables);
        if (!rs.empty()) {
            auto&& row = rs.row(0);
            auto val = row.get<utils::UUID>("version");
            if (val) {
                return table_schema_version(*val);
            }
        }
    }

    ...
```

The issue will therefore be fixed once we enable
`GROUP0_SCHEMA_VERSIONING`.
2023-12-08 17:46:31 +01:00
Kamil Braun
defcf9915c migration_manager: add committed_by_group0 flag to system.scylla_tables mutations
As described in #13957, when creating or altering a table in group 0
mode, we don't want each node to calculate `schema::version()`s
independently using a hash algorithm. Instead, we want to all nodes to
use a single version for that table, commited by the group 0 command.

There's even a column ready for this in `system.scylla_tables` --
`version`. This column is currently being set for system tables, but
it's not being used for user tables.

Similarly to what we did with global schema version in earlier commits,
the obvious thing to do would be to include a live cell for the `version`
column in the `system.scylla_tables` mutation when we perform the schema
change in Raft mode, and to include a tombstone when performing it
outside of Raft mode, for the RECOVERY case.

But it's not that simple because as it turns out, we're *already*
sending a `version` live cell (and also a tombstone, with timestamp
decremented by 1) in all `system.scylla_tables` mutations. But then we
delete that cell when doing schema merge (which begs the question
why were we sending it in the first place? but I digress):
```
        // We must force recalculation of schema version after the merge, since the resulting
        // schema may be a mix of the old and new schemas.
        delete_schema_version(mutation);
```
the above function removes the `version` cell from the mutation.

So we need another way of distinguishing the cases of schema change
originating from group 0 vs outside group 0 (e.g. RECOVERY).

The method I chose is to extend `system.scylla_tables` with a boolean
column, `committed_by_group0`, and extend schema mutations to set
this column.

In the next commit we'll decide whether or not the `version` cell should
be deleted based on the value of this new column.
2023-12-08 17:46:31 +01:00
Kamil Braun
87b2c8a041 schema_tables: use schema version from group 0 if present
As promised in the previous commit, if we persisted a schema version
through a group 0 command, use it after a schema merge instead of
calculating a digest.

Ref: #7620

The above issue will be fixed once we enable the
`GROUP0_SCHEMA_VERSIONING` feature.
2023-12-08 17:46:31 +01:00
Patryk Jędrzejczak
c8ee7d4499 db: make schema commitlog feature mandatory
Using consistent cluster management and not using schema commitlog
ends with a bad configuration throw during bootstrap. Soon, we
will make consistent cluster management mandatory. This forces us
to also make schema commitlog mandatory, which we do in this patch.

A booting node decides to use schema commitlog if at least one of
the two statements below is true:
- the node has `force_schema_commitlog=true` config,
- the node knows that the cluster supports the `SCHEMA_COMMITLOG`
  cluster feature.

The `SCHEMA_COMMITLOG` cluster feature has been added in version
5.1. This patch is supposed to be a part of version 6.0. We don't
support a direct upgrade from 5.1 to 6.0 because it skips two
versions - 5.2 and 5.4. So, in a supported upgrade we can assume
that the version which we upgrade from has schema commitlog. This
means that we don't need to check the `SCHEMA_COMMITLOG` feature
during an upgrade.

The reasoning above also applies to Scylla Enterprise. Version
2024.2 will be based on 6.0. Probably, we will only support
an upgrade to 2024.2 from 2024.1, which is based on 5.4. But even
if we support an upgrade from 2023.x, this patch won't break
anything because 2023.1 is based on 5.2, which has schema
commitlog. Upgrades from 2022.x definitely won't be supported.

When we populate a new cluster, we can use the
`force_schema_commitlog=true` config to use schema commitlog
unconditionally. Then, the cluster feature check is irrelevant.
This check could fail because we initiate schema commitlog before
we learn about the features. The `force_schema_commitlog=true`
config is especially useful when we want to use consistent cluster
management. Failing feature checks would lead to crashes during
initial bootstraps. Moreover, there is no point in creating a new
cluster with `consistent_cluster_management=true` and
`force_schema_commitlog=false`. It would just cause some initial
bootstraps to fail, and after successful restarts, the result would
be the same as if we used `force_schema_commitlog=true` from the
start.

In conclusion, we can unconditionally use schema commitlog without
any checks in 6.0 because we can always safely upgrade a cluster
and start a new cluster.

Apart from making schema commitlog mandatory, this patch adds two
changes that are its consequences:
- making the unneeded `force_schema_commitlog` config unused,
- deprecating the `SCHEMA_COMMITLOG` feature, which is always
  assumed to be true.

Closes scylladb/scylladb#16254
2023-12-04 21:02:16 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Kamil Braun
3582095b79 schema_tables: use smaller timestamp for base mutations included with view update
When a view schema is changed, the schema change command also includes
mutations for the corresponding base table; these mutations don't modify
the base schema but are included in case if the receiver of view
mutations somehow didn't receive base mutations yet (this may in theory
happen outside Raft mode).

There are situations where the schema change command contains both
mutations that describe the current state of the base table -- included
by a view update, as explained above -- and mutations that want to
modify the base table. Such situation arises, for example, when we
update a user-defined type which is referenced by both a view and its
corresponding base table. This triggers a schema change of the view,
which generates mutations to modify the view and includes mutations of
the current base schema, and at the same time it triggers a schema
change of the base, which generates mutations to modify the base.

These two sets of mutations are conflicting with each other. One set
wants to preserve the current state of the base table while the other
wants to modify it. And the two sets of mutations are generated using
the same timestamp, which means that conflict resolution between them is
made on a per-mutation-cell basis, comparing the values in each cell and
taking the "larger" one (meaning of "larger" depends on the type of each
cell).

Fortunately, this conflict is currently benign -- or at least there is
no known situation where it causes problems.

Unfortunately, it started causing problems when I attempted to implement
group 0 schema versioning (PR scylladb/scylladb#15331), where instead of
calculating table versions as hashes of schema mutations, we would send
versions as part of schema change command. These versions would be
stored inside the `system_schema.scylla_tables` table, `version` column,
and sent as part of schema change mutations.

And then the conflict showed. One set of mutations wanted to preserve
the old value of `version` column while the other wanted to update it.
It turned out that sometimes the old `version` prevailed, because the
`version` column in `system_schema.scylla_tables` uses UUID-based
comparison (not timeuuid-based comparison). This manifested as issue
scylladb/scylladb#15530.

To prevent this, the idea in this commit is simple: when generating
mutations for the base table as part of corresponding view update, do
not use the provided timestamp directly -- instead, decrement it by one.
This way, if the schema change command contains mutations that want to
modify the base table, these modifying mutations will win all conflicts
based on the timestamp alone (they are using the same provided
timestamp, but not decremented).

One could argue that the choice of this timestamp is anyway arbitrary.
The original purpose of including base mutations during view update was
to ensure that a node which somehow missed the base mutations, gets them
when applying the view. But in that case, the "most correct" solution
should have been to use the *original* base mutations -- i.e. the ones
that we have on disk -- instead of generating new mutations for the base
with a refreshed timestamp. The base mutations that we have on disk have
smaller timestamps already (since these mutations are from the past,
when the base was last modified or created), so the conflict would also
not happen in this case.

But that solution would require doing a disk read, and we can avoid the
read while still fixing the conflict by using an intermediate solution:
regenerating the mutations but with `timestamp - 1`.

Ref: scylladb/scylladb#15530

Closes scylladb/scylladb#16139
2023-11-28 21:51:18 +01:00