Commit Graph

189 Commits

Author SHA1 Message Date
Kefu Chai
3e84d43f93 treewide: use seastar::format() or fmt::format() explicitly
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.

that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:

```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
  265 |     return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
      |            ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
 4290 |     format(format_string<_Args...> __fmt, _Args&&... __args)
      |     ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
  143 | format(fmt::format_string<A...> fmt, A&&... a) {
      | ^
```

in this change, we

change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
  `seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
  because, `sstring::operator std::basic_string` would incur a deep
  copy.

we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-11 23:21:40 +03:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Kefu Chai
ad649be1bf treewide: drop thrift support
thrift support was deprecated since ScyllaDB 5.2

> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.

so let's drop it. in this change,

* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is
  preserved for backward compatibility, as we could load
  from an existing system.local table which still contains
  this clolumn, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for
  backward compatibility with java-based nodetool.
* `rpc_port` and `start_rpc` options are preserved, but
  they are marked as "Unused". so that the new release
  of scylladb can consume existing scylla.yaml configurations
  which might contain these settings. by making them
  deprecated, user will be able get warned, and update
  their configurations before we actually remove them
  in the next major release.

Fixes #3811
Fixes #18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-07 06:44:59 +08:00
Kefu Chai
af2553e8bc cdc: add formatter for cdc::image_mode and cdc::delta_mode
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for
cdc::image_mode and cdc::delta_mode, and drop their operator<<:s.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17381
2024-02-18 19:21:53 +02:00
Kefu Chai
6c06751640 cdc: not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16725
2024-01-11 09:13:37 +02:00
Pavel Emelyanov
562fcf0c19 locator: Keep optional initial_tablets on r.s. params
Now all the callers have it at hands (spoiler: not yet initialized, but
still) so the params can also have it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-25 16:02:41 +03:00
Pavel Emelyanov
a943bd927b locator: Call create_replication_strategy() with r.s. params
Previous patch added params to r.s. classes' constructors, but callers
don't construct those directly, instead they use the create_r.s.()
wrapper. This patch adds params to the wrapper too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-12-25 15:54:59 +03:00
Nadav Har'El
7ee55dd03e cdc, tablets: don't allow enabling CDC with tablets
We do not yet support enabling CDC in a keyspace that uses tablets
(Refs #16317). But the problem is that today, if this is attempted,
we get a nasty failure: the CDC code creates the extra CDC log table,
it doesn't get tablets, and Raft gets surprised and croaks with a
message like:

    Raft instance is stopped, reason: "background error,
    std::_Nested_exceptionraft::state_machine_error (State machine error at
    raft/server.cc:1230): std::runtime_error (Tablet map not found for
    table 48ca1620-9ea5-11ee-bd7c-22730ed96b85)

After Raft croaks, Scylla never recovers until it is rebooted.

In this patch, we replace this disaster by a graceful error -  a CREATE
TABLE or ALTER TABLE operation with CDC enabled will fail in a clear way,
and allowing Scylla to continue operating normally after this failed request.

This fix is important for allowing us to run tests on Scylla with
tablets, and although CDC tests will fail as expected, they won't
fail the other tests that follow (Refs #16473).

Fixes #16318

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#16474
2023-12-20 10:06:34 +01:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Patryk Jędrzejczak
a762179972 migration_listener: add parameter to on_before_create_column_family
After adding the new prepare_new_column_family_announcement that
doesn't assume the existence of a keyspace, we also need to get
rid of the same assumption in all on_before_create_column_family
calls. After all, they may be initiated before creating the
keyspace. However, some listeners require keyspace_metadata, so we
pass it as a new parameter.
2023-10-31 12:08:03 +01:00
Avi Kivity
d450a145ce Revert "Merge 'reduce announcements of the automatic schema changes ' from Patryk Jędrzejczak"
This reverts commit 4b80130b0b, reversing
changes made to a5519c7c1f. It's suspected
of causing dtest failures due to a bug in coroutine::parallel_for_each.
2023-10-29 18:32:06 +02:00
Patryk Jędrzejczak
7653059369 migration_listener: add parameter to on_before_create_column_family
After adding the new prepare_new_column_family_announcement that
doesn't assume the existence of a keyspace, we also need to get
rid of the same assumption in all on_before_create_column_family
calls. After all, they may be initiated before creating the
keyspace. However, some listeners require keyspace_metadata, so we
pass it as a new parameter.
2023-10-16 14:59:53 +02:00
Kefu Chai
f5b05cf981 treewide: use defaulted operator!=() and operator==()
in C++20, compiler generate operator!=() if the corresponding
operator==() is already defined, the language now understands
that the comparison is symmetric in the new standard.

fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.

in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- it is able to generated the
operator==() if the member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all memeber variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and mark the function as
`default`. moreover, if the class happen to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.

sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.

also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.

please note, because in C++20, both operator= and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of the another variant. if they were
not removed, compiler would, for instance, find ambiguous
overloaded operator '=='.

this change is a cleanup to modernize the code base with C++20
features.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13687
2023-04-27 10:24:46 +03:00
Kefu Chai
df63e2ba27 types: move types.{cc,hh} into types
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.

the building systems are updated accordingly.

the source files referencing `types.hh` were updated using following
command:

```
find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```

the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instea, so it's more explicit.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12926
2023-02-19 21:05:45 +02:00
Kefu Chai
0cb842797a treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-15 22:57:18 +02:00
Avi Kivity
69a385fd9d Introduce schema/ module
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.

Closes #12858
2023-02-15 11:01:50 +02:00
Avi Kivity
00145f9ada test: relax NULL check test predicate
When we start allowing NULL in lists in some contexts, the exact
location where an error is raised (when it's disallowed) will
change. To prepare for that, relax the exception check to just
ensure the word NULL is there, without caring about the exact
wording.
2023-01-18 10:38:24 +02:00
Avi Kivity
2739ac66ed treewide: drop cql_serialization_format
Now that we don't accept cql protocol version 1 or 2, we can
drop cql_serialization format everywhere, except when in the IDL
(since it's part of the inter-node protocol).

A few functions had duplicate versions, one with and one without
a cql_serialization_format parameter. They are deduplicated.

Care is taken that `partition_slice`, which communicates
the cql_serialization_format across nodes, still presents
a valid cql_serialization_format to other nodes when
transmitting itself and rejects protocol 1 and 2 serialization\
format when receiving. The IDL is unchanged.

One test checking the 16-bit serialization format is removed.
2023-01-03 19:54:13 +02:00
Botond Dénes
d1d53f1b84 query: add tombstone-limit to read-command
Propagate the tombstone-limit from coordinator to replicas, to make sure
all is using the same limit.
2022-08-10 06:01:47 +03:00
Benny Halevy
257d74bb34 schema, everywhere: define and use table_id as a strong type
Define table_id as a distinct utils::tagged_uuid modeled after raft
tagged_id, so it can be differentiated from other uuid-class types,
in particular from table_schema_version.

Fixes #11207

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-08 08:09:41 +03:00
Botond Dénes
f8a8fe41d6 cdc/log.hh: expose is_log_name()
Allow outside code to use it to determine whether a table is cdc or not.
This is currently the most reliable method if the custom partitioner is
not set on the schema of the investigated table.
2022-06-10 10:57:12 +03:00
Calle Wilund
adda43edc7 CDC - do not remove log table on CDC disable
Fixes #10489

Killing the CDC log table on CDC disable is unhelpful in many ways,
partly because it can cause random exceptions on nodes trying to
do a CDC-enabled write at the same time as log table is dropped,
but also because it makes it impossible to collect data generated
before CDC was turned off, but which is not yet consumed.

Since data should be TTL:ed anyway, retaining the table should not
really add any overhead beyond the compaction to eventually clear
it. And user did set TTL=0 (disabled), then he is already responsible
for clearing out the data.

This also has the nice feature of meshing with the alternator streams
semantics.

Closes #10601
2022-05-31 19:07:07 +03:00
cvybhu
d85f680df3 cql3: Remove relation class
Functionality of the relation class has been replaced by
expr::to_restriction.

Relation and all classes deriving from it can now be removed.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:17:58 +02:00
Calle Wilund
78350a7e1b cdc: Ensure columns removed from log table are registered as dropped
If we are redefining the log table, we need to ensure any dropped
columns are registered in "dropped_columns" table, otherwise clients will not
be able to read data older than now.
Includes unit test.

Should probably be backported to all CDC enabled versions.

Fixes #10473
Closes #10474
2022-05-04 14:19:39 +02:00
Nadav Har'El
7be3129458 cdc: don't need current keyspace to create the log table
CDC registers to the table-creation hook (before_create_column_family)
to add a second table - the CDC log table - to the same keyspace.
The handler function (on_before_update_column_family() in cdc/log.cc)
wants to retrieve the keyspace's definition, but that does NOT WORK if
we create the keyspace and table in one operation (which is exactly what
we intend to do in Alternator to solve issue #9868) - because at the
time of the hook, the keyspace does not yet exist in the schema.

It turns out that on_before_update_column_family() does not REALLY need
the keyspace. It needed it to pass it on to make_create_table_mutations()
but that function doesn't use the keyspace parameter passed to it! All
it needs is the keyspace's name - which is in the schema anyway and
doesn't need to be looked up.

So in this patch we fix make_create_table_mutations() to not require the
unused keyspace parameter - and fix the CDC code not to look for the
keyspace that is no longer needed.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220215162342.622509-1-nyh@scylladb.com>
2022-02-16 08:38:56 +02:00
Piotr Jastrzebski
09d4438a0d cdc: Handle compact storage correctly in preimage
Base tables that use compact storage may have a special artificial
column that has an empty type.

c010cefc4d fixed the main CDC path to
handle such columns correctly and to not include them in the CDC Log
schema.

This patch makes sure that generation of preimage ignores such empty
column as well.

Fixes #9876
Closes #9910

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2022-01-20 13:23:38 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
bbad8f4677 replica: move ::database, ::keyspace, and ::table to replica namespace
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.

References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.

scylla-gdb.py is adjusted to look for both the new and old names.
2022-01-07 12:04:38 +02:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Juliusz Stasiewicz
5a8741a1ca cdc: Throw when ALTERing cdc options without "enabled":"..."
The problem was that such a command:
```
alter table ks.cf with cdc={'ttl': 120};
```
would assume that "enabled" parameter is the default ("false") and,
in effect, disable CDC on that table. This commit forces the user
to specify that key.

Fixes #6475

Closes #9720
2021-12-07 17:37:44 +02:00
Avi Kivity
595cc328b1 Merge 'cql3: Remove term, replace with expression' from Jan Ciołek
This PR finally removes the `term` class and replaces it with `expression`.

* There was some trouble with `lwt_cache_id` in `expr::function_call`.
  The current code works the following way:
  * for each `function_call` inside a `term` that describes a pk restriction, `prepare_context::add_pk_function_call` is called.
  * `add_pk_function_call` takes a `::shared_ptr<cql3::functions::function_call>`, sets its `cache_id` and pushes this shared pointer onto a vector of all collected function calls
  * Later when some condiition is met we want to clear cache ids of all those collected function calls. To do this we iterate through shared pointers collected in `prepare_context` and clear cache id for each of them.

  This doesn't work with `expr::function_call` because it isn't kept inside a shared pointer.
  To solve this I put the `lwt_cache_id` inside a shared pointer and then `prepare_context` collects these shared pointers to cache ids.

  I also experimented with doing this without any shared pointers, maybe we could just walk through the expression and clear the cache ids ourselves. But the problem is that expressions are copied all the time, we could clear the cache in one place, but forget about a copy. Doing it using shared pointers more closely matches the original behaviour.
The experiment is on the [term2-pr3-backup-altcache](https://github.com/cvybhu/scylla/tree/term2-pr3-backup-altcache) branch
* `shared_ptr<term>` being `nullptr` could mean:
  * It represents a cql value `null`
  * That there is no value, like `std::nullopt` (for example in `attributes.hh`)
  * That it's a mistake, it shouldn't be possible

  A good way to distinguish between optional and mistake is to look for `my_term->bind_and_get()`, we then know that it's not an optional value.

* On the other hand `raw_value` cased to bool means:
   * `false` - null or unset
   * `true` - some value, maybe empty

I ran a simple benchmark on my laptop to see how performance is affected:
```
build/release/test/perf/perf_simple_query --smp 1 -m 1G --operations-per-shard 1000000 --task-quota-ms 10
```
* On master (a21b1fbb2f) I get:
  ```
  176506.60 tps ( 77.0 allocs/op,  12.0 tasks/op,   45831 insns/op)

  median 176506.60 tps ( 77.0 allocs/op,  12.0 tasks/op,   45831 insns/op)
  median absolute deviation: 0.00
  maximum: 176506.60
  minimum: 176506.60
  ```
* On this branch I get:
  ```
  172225.30 tps ( 75.1 allocs/op,  12.1 tasks/op,   46106 insns/op)

  median 172225.30 tps ( 75.1 allocs/op,  12.1 tasks/op,   46106 insns/op)
  median absolute deviation: 0.00
  maximum: 172225.30
  minimum: 172225.30
  ```

Closes #9481

* github.com:scylladb/scylla:
  cql3: Remove remaining mentions of term
  cql3: Remove term
  cql3: Rename prepare_term to prepare_expression
  cql3: Make prepare_term return an expression instead of term
  cql3: expr: Add size check to evaluate_set
  cql3: expr: Add expr::contains_bind_marker
  cql3: expr: Rename find_atom to find_binop
  cql3: expr: Add find_in_expression
  cql3: Remove term in operations
  cql3: Remove term in relations
  cql3: Remove term in multi_column_restrictions
  cql3: Remove term in term_slice, rename to bounds_slice
  cql3: expr: Remove term in expression
  cql3: expr: Add evaluate_IN_list(expression, options)
  cql3: Remove term in column_condition
  cql3: Remove term in select_statement
  cql3: Remove term in update_statement
  cql3: Use internal cql format in insert_prepared_json_statement cache
  types: Add map_type_impl::serialize(range of <bytes, bytes>)
  cql3: Remove term in cql3/attributes
  cql3: expr: Add constant::view() method
  cql3: expr: Implement fill_prepare_context(expression)
  cql3: expr: add expr::visit that takes a mutable expression
  cql3: expr: Add receiver to expr::bind_variable
2021-11-30 16:39:39 +02:00
Piotr Jastrzebski
033a75ff96 cdc: Don't support "on" and "off" values for preimage any more
This is an undocumented feature that causes confusion so let's get rid
of it.

tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>

Closes #9639
2021-11-17 11:54:11 +01:00
Jan Ciolek
e458340821 cql3: Remove term
term isn't used anywhere now. We can remove it and all classes that derive from it.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Avi Kivity
6b02aa72e2 cdc: replace seastar::sprint() with fmt::format()
sprint() is obsolete.
2021-10-27 14:30:06 +03:00
Pavel Emelyanov
db623c5f64 cdc: Replace db::config with generation_service::config
This is to push the service towards general idea that each
component should have its own config and db::config to stay
in main.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 16:04:12 +03:00
Pavel Emelyanov
0fd00d7016 cdc: Add database argument to is_log_for_some_table
All callers has been patched already. This argument can now
be used to replace get_local_storage_proxy().get_db().local()
call.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-27 14:07:26 +03:00
Pavel Emelyanov
fe8bc0757b cdc: Add database argument to check_for_attempt_to_create_nested_cdc_log
The only caller of it already has database argument, just
pass it a bit further

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-27 14:07:18 +03:00
Benny Halevy
4439e5c132 everywhere: cleanup defer.hh includes
Get rid of unused includes of seastar/util/{defer,closeable}.hh
and add a few that are missing from source files.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-22 21:11:39 +03:00
Asias He
6350a19f73 compaction: Move compaction_strategy.hh to compaction dir
The top dir is a mess. Move compaction_strategy.hh and
compaction_strategy_type.hh to the new home.
2021-08-07 08:06:37 +08:00
Avi Kivity
e52ebe2da5 types: convert abstract_type::compare and related to std::strong_ordering
Change comparators around types to std::strong_ordering.

Ref #1449.
2021-07-28 13:19:24 +03:00
Calle Wilund
59555fa363 cdc: fix broken function signature in maybe_back_insert_iterator
Fixes #9103

compare overload was declared as "bool" even though it is a tri-cmp.
causes us to never use the speed-up shortcut (lessen search set),
in turn meaning more overhead for collections.

Closes #9104
2021-07-27 20:37:30 +03:00
Piotr Jastrzebski
c010cefc4d cdc: Handle compact storage tables correctly
When a table with compact storage has no regular column (only primary
key columns), an artificial column of type empty is added. Such column
type can't be returned via CQL so CDC Log shouldn't contain a column
that reflects this artificial column.

This patch does two things:
1. Make sure that CDC Log schema does not contain columns that reflect
   the artificial column from a base table.
2. When composing mutation to CDC Log, ommit the artificial column.

Fixes #8410

Test: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>

Closes #8988
2021-07-12 12:17:35 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
fff7ef1fc2 treewide: reduce boost headers usage in scylla header files
`dev-headers` target is also ensured to build successfully.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-05-20 01:33:18 +03:00
Avi Kivity
b1f9df279a Merge "Untie cdc, storage service and migration notifier knot" from Pavel E
"
Storage service needs migration notifier reference to pass it to cdc
service via get_local_storage_service(). This set removes

- get_local_storage_service from cdc
- migration notifier from storage service
- db_context::builder from cdc (released nuclear binding energy)

tests: unit(dev)
"

* 'br-cdc-no-storage-service' of https://github.com/xemul/scylla:
  storage_service: Remove migration notifier dependency
  cdc: Remove db_context::builder
  cdc: Provide migration notifier right at once
  cdc: Remove db_context::builder::with_migration_notifier
2021-05-11 18:39:10 +03:00
Piotr Grabowski
cd6154e8bf cdc: log: assert post_image is always in full mode
Add an assertion that checks that post_image can never be in non-full
mode.
2021-05-04 12:33:15 +02:00
Pavel Emelyanov
cc813ef0dd cdc: Remove db_context::builder
Right now the builder is just an opaque transfer between cdc_service
constructor args and cdc_service's db_context constructor args.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-29 22:46:57 +03:00
Pavel Emelyanov
3a7ca647af cdc: Provide migration notifier right at once
The only way db_context's migration notifier reference is set up
is via cdc_service->db_context::builder->.build chain of calls.
Since the builder's notifier optional reference is always
disengaged (the .with_migration_notifier is removed by previous
patch) the only possible notifier reference there is from the
storage service which, in turn, is the same as in main.cc.

Said that -- push the notifier reference onto db_context directly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-29 22:40:24 +03:00
Pavel Emelyanov
421a514c30 cdc: Remove db_context::builder::with_migration_notifier
It's unused and removing it makes next patch's life simpler

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-29 22:39:12 +03:00
Piotr Grabowski
b1650114eb cdc: log: fill cdc$deleted_ columns in pre-images
Before this change, cdc$deleted_ columns were all NULL in pre-images.
Lack of such information made it hard to correctly interpret the
pre-image rows, for example:

INSERT INTO tbl(pk, ck, v, v2) VALUES (1, 1, null, 1);
INSERT INTO tbl(pk, ck, v2) VALUES (1, 1, 1);

For this example, pre-image generated for the second operation
would look like this (in both 'true' and 'full' pre-image mode):

pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1

v=NULL has two meanings:
1. If pre-image was in 'true' mode, v=NULL describes that v was not
affected (affected columns: pk, ck, v2).
2. If pre-image was in 'full' mode, v=NULL describes that v was equal
to NULL in the pre-image.

Therefore, to properly decode pre-images you would need to know in
which mode pre-image was configured on the CDC-enabled table at the
moment this CDC log row was inserted. There is no way to determine
such information (you can only check a current mode of pre-image).

A solution to this problem is to fill in the cdc$deleted_ columns
for pre-images. After this change, for the INSERT described above, CDC
now generates the following log row:

If in pre-image 'true' mode:
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1

If in pre-image 'full' mode:
pk=1, ck=1, v=NULL, cdc$deleted_v=true, v2=1

A client library now can properly decode a pre-image row. If it sees
a NULL value, it can now check the cdc$deleted_ column to determine
if this NULL value was a part of pre-image or it was omitted due to
not being an affected column in the delta operation.

No such change is necessary for the post-image rows, as those images
are always generated in the 'full' mode.

Additional example of trouble decoding pre-images before this change.
tbl2 - 'true' pre-image mode, tbl3 - 'full' pre-image mode:

INSERT INTO tbl2(pk, ck, v, v2) VALUES (1, 1, 5, 1);
INSERT INTO tbl3(pk, ck, v, v2) VALUES (1, 1, null, 1);

INSERT INTO tbl2(pk, ck, v2) VALUES (1, 1, 1);

generated pre-image:
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1

INSERT INTO tbl3(pk, ck, v2) VALUES (1, 1, 1);

generated pre-image:
pk=1, ck=1, v=NULL, cdc$deleted_v=NULL, v2=1

Both pre-images look the same, but:
1. v=NULL in tbl2 describes v being omitted from the pre-image.
2. v=NULL in tbl3 described v being NULL in the pre-image.
2021-04-29 18:04:07 +02:00