Add to the schema object a member that points to the CDC schema object
that is compatible with this schema, if any.
The compatible CDC schema is created and altered with its base schema in
the same group0 operation.
When generating CDC log mutations for some base mutation we want them to
be created using a compatible schema thas has a CDC column corresponding
to each base column. This change will allow us to find the right CDC
schema given a base mutation.
We also update the relevant structures in the schema registry that are
related to learning about schemas and transporting schemas across
shards or nodes.
When transporting a schema as frozen_schema, we need to transport the
frozen cdc schema as well, and set it again when unfreezing and
reconstructing the schema.
When adding a schema to the registry, we need to ensure its CDC schema
is added to the registry as well.
Currently we always set the CDC schema to nullptr and maintain the
previous behavior. We will change it in a later commit. Until then, we
mark all places where CDC schema is passed clearly so we don't forget
it.
remove the _base_info member from global_schema_ptr, and used the
base_info we have stored in the schema registry entry instead.
Currently when constructing a global_schema_ptr from a schema_ptr it
extracts and stores the base_info from the schema_ptr. Later it uses it
to reconstruct the schema_ptr, together with the frozen schema from the
schema registry entry.
But we can use the base_info that is already stored in the
schema registry entry.
Change the schema loader type in the schema_registry to return a
extended_frozen_schema instead of view_schema_and_base_info, and
remove view_schema_and_base_info which is not used anymore.
The casting between them is trivial.
The schema_registry_entry holds a frozen_schema and a base_info. The
base_info is extracted from the schema_ptr on load of a schema_ptr, and
it is used when unfreezing the schema.
But this is exactly what extended_frozen_schema is doing, so we can
just store an object of this type in the schema_registry_entry.
This makes the code simpler because the schema registry doesn't need to
be aware of the base_info.
Currently we construct a frozen schema with base info in few places, and
the caller is responsible for constructing the frozen schema and extracting
the base info if it's a view table.
We change it to make it simpler and remove the burden from the caller.
The caller can simply pass the schema_ptr, and the constructor for
extended_frozen_schema will construct the frozen schema and extract
the additional info it needs. This will make it easier to add additional
fields, and reduces code duplication.
We also make temporary castings between extended_frozen_schema and
view_schema_and_base_info for the transition, which are trivial, until
they are combined to a single type.
In the previous commits we made sure that the base info is not dependent
on the base schema version, and the info dependent on the base schema
version is calculated when it's needed. In this patch we remove the
unnecessary re-setting of the base_info.
The set_base_info method isn't removed completely, because it also has
a secondary function - zeroing the view_info fields other than base_info.
Because of this, in this patch we rename it accordingly and limit its
use to the updates caused by a base schema change.
The base info now only contains values which are not reliant on the
base schema version. We remove the the base schema from the base info
to make it immutable regardless of base schema version, at the point
of this patch it's also not needed anywhere - the new base info can
replace the base schema in most places, and in the few (view_updates)
where we need it, we pull the most recent base schema version from
the database.
After this change, the base info no longer changes in a view schema
after creation, so we'll no longer get errors when we try generating
view updates with a base_info that's incompatible with a specific
base schema version.
Fixes#9059Fixes#21292Fixes#22410
In the following patch we plan to remove the base schema from the base_info
to make the base_info immutable. To do that, we first prepare the schema
registry for the change; we need to be able to create view schemas from
frozen schemas there and frozen schemas have no information about the base
table. Unless we do this change, after base schemas are removed from the
base info, we'll no longer be able to load a view schema to the schema registry
without looking up the base schema in the database.
This change also required some updates to schema building:
* we add a method for unfreezing a view schema with base info instead of
a base schema
* we make it possible to use schema_builder with a base info instead of
a base schema
* we add a method for creating a view schema from mutations with a base info
instead of a base schema
* we add a view_info constructor withat base info instead of a base schema
* we update the naming in schema_registry to reflect the usage of base info
instead of base schema
Currently, the base_info may or may not be set in view schemas.
Even when it's set, it may be modified. This necessitates extra
checks when handling view schemas, as well as potentially causing
errors when we forget to set it at some point.
Instead, we want to make the base info an immutable member of view
schemas (inside view_info). The first step towards that is making
sure that all newly created schemas have the base info set.
We achieve that by requiring a base schema when constructing a view
schema. Unfortunately, this adds complexity each time we're making
a view schema - we need to get the base schema as well.
In most cases, the base schema is already available. The most
problematic scenario is when we create a schema from mutations:
- when parsing system tables we can get the schema from the
database, as regular tables are parsed before views
- when loading a view schema using the schema loader tool, we need
to load the base additionally to the view schema, effectively
doubling the work
- when pulling the schema from another node - in this case we can
only get the current version of the base schema from the local
database
Additionally, we need to consider the base schema version - when
we generate view updates the version of the base schema used for
reads should match the version of the base schema in view's base
info.
This is achieved by selecting the correct (old or new) schema in
`db::schema_tables::merge_tables_and_views` and using the stored
base schema in the schema_registry.
After the previous patches, the view schemas returned by schema registry
always have their base info set. As such, we no longer need to set it after
getting the view schema from the registry. This patch removes these
unnecessary updates.
The schema registry now holds base schemas for view schemas.
The base schema may change without changing the view schema, so to
preserve the change in the schema registry, we also update the
base schema in the registry when updating the base info in the
view schema.
Currently, when we load a frozen schema into the registry, we lose
the base info if the schema was of a view. Because of that, in various
places we need to set the base info again, and in some codepaths we
may miss it completely, which may make us unable to process some
requests (for example, when executing reverse queries on views).
Even after setting the base info, we may still lose it if the schema
entry gets deactivated.
To fix this, this patch adds the base schema to the registry, alongside
the view schema. With the base schema, we can now set the base
info when returning the schema from the registry. As a result, we can now
assume that all view schemas returned by the registry have base_info set.
To store the base schema, the loader methods now have to return the base
schema alongside the view schema. At the same time, when loading into
the registry, we need to check whether we're loading a view schema, and if
so, we need to also provide the base schema. When inserting a regular table
schema, the base schema should be a disengaged optional.
the log.hh under the root of the tree was created keep the backward
compatibility when seastar was extracted into a separate library.
so log.hh should belong to `utils` directory, as it is based solely
on seastar, and can be used all subsystems.
in this change, we move log.hh into utils/log.hh to that it is more
modularized. and this also improves the readability, when one see
`#include "utils/log.hh"`, it is obvious that this source file
needs the logging system, instead of its own log facility -- please
note, we do have two other `log.hh` in the tree.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.
Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.
To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.
[1] 66ef711d68Closesscylladb/scylladb#20006
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.
Replace with seastar::future::get(), which does the same thing.
The schema registry disarms internal timers when it is destroyed.
This accesses the Seastar reactor. However, after [1] we don't have ordering
between the reactor destruction and the thread_local registry destruction.
Fix this by flushing all entries when the database is destroyed. The
database object is fundamental so it's unlikely we'll have anything
using the registry after it's gone.
[1] 101b245ed7
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
This will make it easier to access table proprties in places which
only have schema_ptr. This is in particular useful when replacing
dht::shard_of() uses with s->table().shard_of(), now that sharding is
no longer static, but table-specific.
Also, it allows us to install a guard which catches invalid uses of
schema::get_sharder() on tablet-based tables.
It will be helpful for other uses as well. For example, we can now get
rid of the static_props hack.
The netyr may exist, but its schema may not yet be loaded. learn()
didn't take that into account. This problem is not reachable in
production code, which currently always calls get_or_load() before
learn(), except for boot, but there's no concurrency at that point.
Exposed by unit test added later.
System tables have static schemas and code uses those static schemas
instead of looking them up in the database. We want those schemas to
have a valid table() once the table is created, so we need to attach
registry entry to the target schema rather than to a schema duplicate.
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.
Closes#12858