before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.
that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:
```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
265 | return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
| ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
4290 | format(format_string<_Args...> __fmt, _Args&&... __args)
| ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
143 | format(fmt::format_string<A...> fmt, A&&... a) {
| ^
```
in this change, we
change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
`seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
because, `sstring::operator std::basic_string` would incur a deep
copy.
we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.
Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.
To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.
[1] 66ef711d68Closesscylladb/scylladb#20006
thrift support was deprecated since ScyllaDB 5.2
> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.
so let's drop it. in this change,
* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is
preserved for backward compatibility, as we could load
from an existing system.local table which still contains
this clolumn, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for
backward compatibility with java-based nodetool.
* `rpc_port` and `start_rpc` options are preserved, but
they are marked as "Unused". so that the new release
of scylladb can consume existing scylla.yaml configurations
which might contain these settings. by making them
deprecated, user will be able get warned, and update
their configurations before we actually remove them
in the next major release.
Fixes#3811Fixes#18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define formatters for
cdc::image_mode and cdc::delta_mode, and drop their operator<<:s.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#17381
Now all the callers have it at hands (spoiler: not yet initialized, but
still) so the params can also have it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Previous patch added params to r.s. classes' constructors, but callers
don't construct those directly, instead they use the create_r.s.()
wrapper. This patch adds params to the wrapper too.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We do not yet support enabling CDC in a keyspace that uses tablets
(Refs #16317). But the problem is that today, if this is attempted,
we get a nasty failure: the CDC code creates the extra CDC log table,
it doesn't get tablets, and Raft gets surprised and croaks with a
message like:
Raft instance is stopped, reason: "background error,
std::_Nested_exceptionraft::state_machine_error (State machine error at
raft/server.cc:1230): std::runtime_error (Tablet map not found for
table 48ca1620-9ea5-11ee-bd7c-22730ed96b85)
After Raft croaks, Scylla never recovers until it is rebooted.
In this patch, we replace this disaster by a graceful error - a CREATE
TABLE or ALTER TABLE operation with CDC enabled will fail in a clear way,
and allowing Scylla to continue operating normally after this failed request.
This fix is important for allowing us to run tests on Scylla with
tablets, and although CDC tests will fail as expected, they won't
fail the other tests that follow (Refs #16473).
Fixes#16318
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#16474
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
After adding the new prepare_new_column_family_announcement that
doesn't assume the existence of a keyspace, we also need to get
rid of the same assumption in all on_before_create_column_family
calls. After all, they may be initiated before creating the
keyspace. However, some listeners require keyspace_metadata, so we
pass it as a new parameter.
This reverts commit 4b80130b0b, reversing
changes made to a5519c7c1f. It's suspected
of causing dtest failures due to a bug in coroutine::parallel_for_each.
After adding the new prepare_new_column_family_announcement that
doesn't assume the existence of a keyspace, we also need to get
rid of the same assumption in all on_before_create_column_family
calls. After all, they may be initiated before creating the
keyspace. However, some listeners require keyspace_metadata, so we
pass it as a new parameter.
in C++20, compiler generate operator!=() if the corresponding
operator==() is already defined, the language now understands
that the comparison is symmetric in the new standard.
fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.
in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- it is able to generated the
operator==() if the member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all memeber variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and mark the function as
`default`. moreover, if the class happen to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.
sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.
also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.
please note, because in C++20, both operator= and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of the another variant. if they were
not removed, compiler would, for instance, find ambiguous
overloaded operator '=='.
this change is a cleanup to modernize the code base with C++20
features.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13687
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.
the building systems are updated accordingly.
the source files referencing `types.hh` were updated using following
command:
```
find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```
the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instea, so it's more explicit.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#12926
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.
Closes#12858
When we start allowing NULL in lists in some contexts, the exact
location where an error is raised (when it's disallowed) will
change. To prepare for that, relax the exception check to just
ensure the word NULL is there, without caring about the exact
wording.
Now that we don't accept cql protocol version 1 or 2, we can
drop cql_serialization format everywhere, except when in the IDL
(since it's part of the inter-node protocol).
A few functions had duplicate versions, one with and one without
a cql_serialization_format parameter. They are deduplicated.
Care is taken that `partition_slice`, which communicates
the cql_serialization_format across nodes, still presents
a valid cql_serialization_format to other nodes when
transmitting itself and rejects protocol 1 and 2 serialization\
format when receiving. The IDL is unchanged.
One test checking the 16-bit serialization format is removed.
Define table_id as a distinct utils::tagged_uuid modeled after raft
tagged_id, so it can be differentiated from other uuid-class types,
in particular from table_schema_version.
Fixes#11207
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Allow outside code to use it to determine whether a table is cdc or not.
This is currently the most reliable method if the custom partitioner is
not set on the schema of the investigated table.
Fixes#10489
Killing the CDC log table on CDC disable is unhelpful in many ways,
partly because it can cause random exceptions on nodes trying to
do a CDC-enabled write at the same time as log table is dropped,
but also because it makes it impossible to collect data generated
before CDC was turned off, but which is not yet consumed.
Since data should be TTL:ed anyway, retaining the table should not
really add any overhead beyond the compaction to eventually clear
it. And user did set TTL=0 (disabled), then he is already responsible
for clearing out the data.
This also has the nice feature of meshing with the alternator streams
semantics.
Closes#10601
Functionality of the relation class has been replaced by
expr::to_restriction.
Relation and all classes deriving from it can now be removed.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
If we are redefining the log table, we need to ensure any dropped
columns are registered in "dropped_columns" table, otherwise clients will not
be able to read data older than now.
Includes unit test.
Should probably be backported to all CDC enabled versions.
Fixes#10473Closes#10474
CDC registers to the table-creation hook (before_create_column_family)
to add a second table - the CDC log table - to the same keyspace.
The handler function (on_before_update_column_family() in cdc/log.cc)
wants to retrieve the keyspace's definition, but that does NOT WORK if
we create the keyspace and table in one operation (which is exactly what
we intend to do in Alternator to solve issue #9868) - because at the
time of the hook, the keyspace does not yet exist in the schema.
It turns out that on_before_update_column_family() does not REALLY need
the keyspace. It needed it to pass it on to make_create_table_mutations()
but that function doesn't use the keyspace parameter passed to it! All
it needs is the keyspace's name - which is in the schema anyway and
doesn't need to be looked up.
So in this patch we fix make_create_table_mutations() to not require the
unused keyspace parameter - and fix the CDC code not to look for the
keyspace that is no longer needed.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220215162342.622509-1-nyh@scylladb.com>
Base tables that use compact storage may have a special artificial
column that has an empty type.
c010cefc4d fixed the main CDC path to
handle such columns correctly and to not include them in the CDC Log
schema.
This patch makes sure that generation of preimage ignores such empty
column as well.
Fixes#9876Closes#9910
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.
References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.
scylla-gdb.py is adjusted to look for both the new and old names.
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.
As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
The problem was that such a command:
```
alter table ks.cf with cdc={'ttl': 120};
```
would assume that "enabled" parameter is the default ("false") and,
in effect, disable CDC on that table. This commit forces the user
to specify that key.
Fixes#6475Closes#9720
This PR finally removes the `term` class and replaces it with `expression`.
* There was some trouble with `lwt_cache_id` in `expr::function_call`.
The current code works the following way:
* for each `function_call` inside a `term` that describes a pk restriction, `prepare_context::add_pk_function_call` is called.
* `add_pk_function_call` takes a `::shared_ptr<cql3::functions::function_call>`, sets its `cache_id` and pushes this shared pointer onto a vector of all collected function calls
* Later when some condiition is met we want to clear cache ids of all those collected function calls. To do this we iterate through shared pointers collected in `prepare_context` and clear cache id for each of them.
This doesn't work with `expr::function_call` because it isn't kept inside a shared pointer.
To solve this I put the `lwt_cache_id` inside a shared pointer and then `prepare_context` collects these shared pointers to cache ids.
I also experimented with doing this without any shared pointers, maybe we could just walk through the expression and clear the cache ids ourselves. But the problem is that expressions are copied all the time, we could clear the cache in one place, but forget about a copy. Doing it using shared pointers more closely matches the original behaviour.
The experiment is on the [term2-pr3-backup-altcache](https://github.com/cvybhu/scylla/tree/term2-pr3-backup-altcache) branch
* `shared_ptr<term>` being `nullptr` could mean:
* It represents a cql value `null`
* That there is no value, like `std::nullopt` (for example in `attributes.hh`)
* That it's a mistake, it shouldn't be possible
A good way to distinguish between optional and mistake is to look for `my_term->bind_and_get()`, we then know that it's not an optional value.
* On the other hand `raw_value` cased to bool means:
* `false` - null or unset
* `true` - some value, maybe empty
I ran a simple benchmark on my laptop to see how performance is affected:
```
build/release/test/perf/perf_simple_query --smp 1 -m 1G --operations-per-shard 1000000 --task-quota-ms 10
```
* On master (a21b1fbb2f) I get:
```
176506.60 tps ( 77.0 allocs/op, 12.0 tasks/op, 45831 insns/op)
median 176506.60 tps ( 77.0 allocs/op, 12.0 tasks/op, 45831 insns/op)
median absolute deviation: 0.00
maximum: 176506.60
minimum: 176506.60
```
* On this branch I get:
```
172225.30 tps ( 75.1 allocs/op, 12.1 tasks/op, 46106 insns/op)
median 172225.30 tps ( 75.1 allocs/op, 12.1 tasks/op, 46106 insns/op)
median absolute deviation: 0.00
maximum: 172225.30
minimum: 172225.30
```
Closes#9481
* github.com:scylladb/scylla:
cql3: Remove remaining mentions of term
cql3: Remove term
cql3: Rename prepare_term to prepare_expression
cql3: Make prepare_term return an expression instead of term
cql3: expr: Add size check to evaluate_set
cql3: expr: Add expr::contains_bind_marker
cql3: expr: Rename find_atom to find_binop
cql3: expr: Add find_in_expression
cql3: Remove term in operations
cql3: Remove term in relations
cql3: Remove term in multi_column_restrictions
cql3: Remove term in term_slice, rename to bounds_slice
cql3: expr: Remove term in expression
cql3: expr: Add evaluate_IN_list(expression, options)
cql3: Remove term in column_condition
cql3: Remove term in select_statement
cql3: Remove term in update_statement
cql3: Use internal cql format in insert_prepared_json_statement cache
types: Add map_type_impl::serialize(range of <bytes, bytes>)
cql3: Remove term in cql3/attributes
cql3: expr: Add constant::view() method
cql3: expr: Implement fill_prepare_context(expression)
cql3: expr: add expr::visit that takes a mutable expression
cql3: expr: Add receiver to expr::bind_variable
This is an undocumented feature that causes confusion so let's get rid
of it.
tests: unit(dev)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Closes#9639
This is to push the service towards general idea that each
component should have its own config and db::config to stay
in main.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
All callers has been patched already. This argument can now
be used to replace get_local_storage_proxy().get_db().local()
call.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Get rid of unused includes of seastar/util/{defer,closeable}.hh
and add a few that are missing from source files.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Fixes#9103
compare overload was declared as "bool" even though it is a tri-cmp.
causes us to never use the speed-up shortcut (lessen search set),
in turn meaning more overhead for collections.
Closes#9104
When a table with compact storage has no regular column (only primary
key columns), an artificial column of type empty is added. Such column
type can't be returned via CQL so CDC Log shouldn't contain a column
that reflects this artificial column.
This patch does two things:
1. Make sure that CDC Log schema does not contain columns that reflect
the artificial column from a base table.
2. When composing mutation to CDC Log, ommit the artificial column.
Fixes#8410
Test: unit(dev)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Closes#8988
"
Storage service needs migration notifier reference to pass it to cdc
service via get_local_storage_service(). This set removes
- get_local_storage_service from cdc
- migration notifier from storage service
- db_context::builder from cdc (released nuclear binding energy)
tests: unit(dev)
"
* 'br-cdc-no-storage-service' of https://github.com/xemul/scylla:
storage_service: Remove migration notifier dependency
cdc: Remove db_context::builder
cdc: Provide migration notifier right at once
cdc: Remove db_context::builder::with_migration_notifier
Right now the builder is just an opaque transfer between cdc_service
constructor args and cdc_service's db_context constructor args.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The only way db_context's migration notifier reference is set up
is via cdc_service->db_context::builder->.build chain of calls.
Since the builder's notifier optional reference is always
disengaged (the .with_migration_notifier is removed by previous
patch) the only possible notifier reference there is from the
storage service which, in turn, is the same as in main.cc.
Said that -- push the notifier reference onto db_context directly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Before this change, cdc$deleted_ columns were all NULL in pre-images.
Lack of such information made it hard to correctly interpret the
pre-image rows, for example:
INSERT INTO tbl(pk, ck, v, v2) VALUES (1, 1, null, 1);
INSERT INTO tbl(pk, ck, v2) VALUES (1, 1, 1);
For this example, pre-image generated for the second operation
would look like this (in both 'true' and 'full' pre-image mode):
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1
v=NULL has two meanings:
1. If pre-image was in 'true' mode, v=NULL describes that v was not
affected (affected columns: pk, ck, v2).
2. If pre-image was in 'full' mode, v=NULL describes that v was equal
to NULL in the pre-image.
Therefore, to properly decode pre-images you would need to know in
which mode pre-image was configured on the CDC-enabled table at the
moment this CDC log row was inserted. There is no way to determine
such information (you can only check a current mode of pre-image).
A solution to this problem is to fill in the cdc$deleted_ columns
for pre-images. After this change, for the INSERT described above, CDC
now generates the following log row:
If in pre-image 'true' mode:
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1
If in pre-image 'full' mode:
pk=1, ck=1, v=NULL, cdc$deleted_v=true, v2=1
A client library now can properly decode a pre-image row. If it sees
a NULL value, it can now check the cdc$deleted_ column to determine
if this NULL value was a part of pre-image or it was omitted due to
not being an affected column in the delta operation.
No such change is necessary for the post-image rows, as those images
are always generated in the 'full' mode.
Additional example of trouble decoding pre-images before this change.
tbl2 - 'true' pre-image mode, tbl3 - 'full' pre-image mode:
INSERT INTO tbl2(pk, ck, v, v2) VALUES (1, 1, 5, 1);
INSERT INTO tbl3(pk, ck, v, v2) VALUES (1, 1, null, 1);
INSERT INTO tbl2(pk, ck, v2) VALUES (1, 1, 1);
generated pre-image:
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1
INSERT INTO tbl3(pk, ck, v2) VALUES (1, 1, 1);
generated pre-image:
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1
Both pre-images look the same, but:
1. v=NULL in tbl2 describes v being omitted from the pre-image.
2. v=NULL in tbl3 described v being NULL in the pre-image.