scylladb

Author	SHA1	Message	Date
Karol Nowacki	ca62effdd2	vector_search: Restrict vector index tests to tablets only Vector indexes are going to be supported only for tablets (see VECTOR-322). As a result, tests using vector indexes will be failing when run with vnodes. This change ensures tests using vector indexes run exclusively with tablets. Fixes: VECTOR-49 Closes scylladb/scylladb#26843	2025-11-25 09:26:16 +02:00
Michael Litvak	e7dbccd59e	cdc: use chunked_vector instead of vector for stream ids use utils::chunked_vector instead of std::vector to store cdc stream sets for tablets. a cdc stream set usually represents all streams for a specific table and timestamp, and has a stream id per each tablet of the table. each stream id is represented by 16 bytes. thus the vector could require quite large contiguous allocations for a table that has many tablets. change it to chunked_vector to avoid large contiguous allocations. Fixes scylladb/scylladb#26791 Closes scylladb/scylladb#26792	2025-10-31 13:02:34 +01:00
Piotr Wieczorek	2812e67f47	cdc: Emit a preimage for non-clustered tables Until this patch, CDC didn't fetch a preimage for mutations containing only a partition tombstone. Therefore, single-row deletions in a table without a clustering key didn't include a preimage, which was inconsistent with single-row clustered deletions. This commit addresses this inconsistency. A second reason is compatibility with DynamoDB Streams, which doesn't support entire-partition deletes. Alternator uses partition tombstones for single-row deletions, though, and in these cases the 'OldImage' was missing from REMOVE records. Fixes https://github.com/scylladb/scylladb/issues/26382 Closes scylladb/scylladb#26578	2025-10-29 17:54:58 +02:00
Michael Litvak	440caeabcb	cdc: helpers for garbage collecting old streams for tablets introduce helper functions that can be used for garbage collecting old cdc streams for tablets-based keyspaces. - get_new_base_for_gc: finds a new base timestamp given a TTL, such that all older timestamps and streams can be removed. - get_cdc_stream_gc_mutations: given new base timestamp and streams, builds mutations that update the internal cdc tables and remove the older streams. - garbage_collect_cdc_streams_for_table: combines the two functions above to find a new base and build mutations to update it for a specific table - garbage_collect_cdc_streams: builds gc mutations for all cdc tables	2025-10-26 11:01:20 +01:00
Michael Litvak	67410cac4d	cdc: generate_stream_diff helper function This helper function receives two sets of streams and constructs their difference - closed and opened streams.	2025-09-17 14:47:12 +02:00
Michael Litvak	9ec4b6ccb1	cdc: load tablet streams metadata from tables Read the CDC stream metadata from the internal system tables, and store it in the cdc metadata data structures. The metadata is stored in the tables as diffs which is more storage efficient, but when in-memory we store it as full stream sets for each timestamp. This is more useful because we need to be able to find a stream given timestamp and token.	2025-09-17 14:47:12 +02:00
Dawid Pawlik	61c7b935e1	test/boost: adjust CDC boost tests for Vector Search Adjust name conflict and permissions tests when enabling CDC for Vector Search. Add test that checks if CDC with vector column is setup properly.	2025-08-20 17:20:37 +02:00
Dawid Mędrek	20d0050f4e	cdc: Forbid altering columns of CDC log tables directly The set of columns of a CDC log table should be managed automatically by Scylla, and the user should not have the ability to manipulate them directly. That could lead to disastrous consequences such as a segmentation fault. In this commit, we're restricting those operations. We also provide two validation tests. One of the existing tests had to be adjusted as it modified the type of a column in a CDC log table. Since the test simply verifies that the user has sufficient permissions to perform `ALTER TABLE` on the log table, the test is still valid. Fixes scylladb/scylladb#24643	2025-07-16 15:35:48 +02:00
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Kefu Chai	d0a3311ced	locator: do not include unused headers these unused includes were identified by clang-include-cleaner. after auditing these source files, all of the reports have been confirmed. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22199	2025-01-08 14:26:48 +02:00
Takuya ASADA	03461d6a54	test: compile unit tests into a single executable To reduce test executable size and speed up compilation time, compile unit tests into a single executable. Here is a file size comparison of the unit test executable: - Before applying the patch $ du -h --exclude='.o' --exclude='.o.d' build/release/test/boost/ build/debug/test/boost/ 11G build/release/test/boost/ 29G build/debug/test/boost/ - After applying the patch du -h --exclude='.o' --exclude='.o.d' build/release/test/boost/ build/debug/test/boost/ 5.5G build/release/test/boost/ 19G build/debug/test/boost/ It reduces executable sizes 5.5GB on release, and 10GB on debug. Closes #9155 Closes scylladb/scylladb#21443	2024-12-22 19:14:09 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Kefu Chai	48c8d24345	treewide: drop support for fmt < v10 since fedora 38 is EOL. and fedora 39 comes with fmt v10.0.0, also, we've switched to the build image based on fedora 40, which ships fmt-devel v10.2.1, there is no need to support fmt < 10. in this change, we drop the support fmt < 10. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21847	2024-12-09 20:42:38 +02:00
Kefu Chai	bab12e3a98	treewide: migrate from boost::adaptors::transformed to std::views::transform now that we are allowed to use C++23. we now have the luxury of using `std::views::transform`. in this change, we: - replace `boost::adaptors::transformed` with `std::views::transform` - use `fmt::join()` when appropriate where `boost::algorithm::join()` is not applicable to a range view returned by `std::views::transform`. - use `std::ranges::fold_left()` to accumulate the range returned by `std::views::transform` - use `std::ranges::fold_left()` to get the maximum element in the range returned by `std::views::transform` - use `std::ranges::min()` to get the minimal element in the range returned by `std::views::transform` - use `std::ranges::equal()` to compare the range views returned by `std::views::transform` - remove unused `#include <boost/range/adaptor/transformed.hpp>` - use `std::ranges::subrange()` instead of `boost::make_iterator_range()`, to feed `std::views::transform()` a view range. to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. limitations: there are still a couple places where we are still using `boost::adaptors::transformed` due to the lack of a C++23 alternative for `boost::join()` and `boost::adaptors::uniqued`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21700	2024-12-03 09:41:32 +02:00
Kefu Chai	24d14b601b	treewide: s/boost::adaptors::map_values/std::views::values/ now that we are allowed to use C++23. we now have the luxury of using `std::views::values`. in this change, we: - replace `boost::adaptors::map_values` with `std::views::values` - update affected code to work with `std::views::values` - the places where we use `boost::join()` are not changed, because we cannot use `std::views::concat` yet. this helper is only available in C++26. to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21265	2024-10-27 21:32:45 +02:00
Kefu Chai	5cd619a60c	treewide: s/boost::adaptors::map_keys/std::views::keys/ now that we are allowed to use C++23. we now have the luxury of using `std::views::keys`. in this change, we: - replace `boost::adaptors::map_keys` with `std::views::keys` - update affected code to work with `std::views::keys` to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21198	2024-10-21 12:47:52 +03:00
Kefu Chai	3e84d43f93	treewide: use seastar::format() or fmt::format() explicitly before this change, we rely on `using namespace seastar` to use `seastar::format()` without qualifying the `format()` with its namespace. this works fine until we changed the parameter type of format string `seastar::format()` from `const char*` to `fmt::format_string<...>`. this change practically invited `seastar::format()` to the club of `std::format()` and `fmt::format()`, where all members accept a templated parameter as its `fmt` parameter. and `seastar::format()` is not the best candidate anymore. despite that argument-dependent lookup (ADL for short) favors the function which is in the same namespace as its parameter, but `using namespace` makes `seastar::format()` more competitive, so both `std::format()` and `seastar::format()` are considered as the candidates. that is what is happening in scylladb in quite a few caller sites of `format()`, hence ADL is not able to tell which function is the winner in the name lookup: ``` /__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous 265 \| return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id()); \| ^~~~~~ /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 4290 \| format(format_string<_Args...> __fmt, _Args&&... __args) \| ^ /__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 143 \| format(fmt::format_string<A...> fmt, A&&... a) { \| ^ ``` in this change, we change all `format()` to either `fmt::format()` or `seastar::format()` with following rules: - if the caller expects an `sstring` or `std::string_view`, change to `seastar::format()` - if the caller expects an `std::string`, change to `fmt::format()`. because, `sstring::operator std::basic_string` would incur a deep copy. we will need another change to enable scylladb to compile with the latest seastar. namely, to pass the format string as a templated parameter down to helper functions which format their parameters. to minimize the scope of this change, let's include that change when bumping up the seastar submodule. as that change will depend on the seastar change. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-09-11 23:21:40 +03:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Kefu Chai	ef0f4eaef2	test: do not use operator<< for std::optional we don't provide it anymore, and if any of existing type provides constructor accepting an `optional<>`, and hence can be formatted using operator<< after converting it, neither shall we rely on this behavior, as it is fragile. so, in this change, we switch to `fmt::print()` to use {fmt} to print `optional<>`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-06-18 10:41:48 +08:00
Kefu Chai	372a4d1b79	treewide: do not define FMT_DEPRECATED_OSTREAM since we do not rely on FMT_DEPRECATED_OSTREAM to define the fmt::formatter for us anymore, let's stop defining `FMT_DEPRECATED_OSTREAM`. in this change, * utils: drop the range formatters in to_string.hh and to_string.c, as we don't use them anymore. and the tests for them in test/boost/string_format_test.cc are removed accordingly. * utils: use fmt to print chunk_vector and small_vector. as we are not able to print the elements using operator<< anymore after switching to {fmt} formatters. * test/boost: specialize fmt::details::is_std_string_like<bytes> due to a bug in {fmt} v9, {fmt} fails to format a range whose element type is `basic_sstring<uint8_t>`, as it considers it as a string-like type, but `basic_sstring<uint8_t>`'s char type is signed char, not char. this issue does not exist in {fmt} v10, so, in this change, we add a workaround to explicitly specialize the type trait to assure that {fmt} format this type using its `fmt::formatter` specialization instead of trying to format it as a string. also, {fmt}'s generic ranges formatter calls the pair formatter's `set_brackets()` and `set_separator()` methods when printing the range, but operator<< based formatter does not provide these methods, we have to include this change in the change switching to {fmt}, otherwise the change specializing `fmt::details::is_std_string_like<bytes>` won't compile. * test/boost: in tests, we use `BOOST_REQUIRE_EQUAL()` and its friends for comparing values. but without the operator<< based formatters, Boost.Test would not be able to print them. after removing the homebrew formatters, we need to use the generic `boost_test_print_type()` helper to do this job. so we are including `test_utils.hh` in tests so that we can print the formattable types. * treewide: add "#include "utils/to_string.hh" where `fmt::formatter<optional<>>` is used. * configure.py: do not define FMT_DEPRECATED_OSTREAM * cmake: do not define FMT_DEPRECATED_OSTREAM Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:57:36 +08:00
Kefu Chai	a439ebcfce	treewide: include fmt/ranges.h and/or fmt/std.h before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map, optional and variant using {fmt} instead of the homebrew formatter based on operator<<. with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:56:16 +08:00
Kefu Chai	97587a2ea4	test/boost: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17139	2024-02-06 13:22:16 +02:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Pavel Emelyanov	64c8a59e9b	test: Open-code ks.cf name parse into cdc_test The test uses qualified ks.cf name to find the schema, but it's the only test case that does it. There's no point in maintaining a dedicated helper on the cql_test_env just for that Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 11:46:36 +03:00
Pavel Emelyanov	b4c84f9174	test: Use BOOST_REQUIRE(!db.has_schema()) Surprisingly there's a dedicated helper for the check opposite to the one fixed in the previous patch. Fix one too Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 11:46:36 +03:00
Pavel Emelyanov	063baabaee	test: Use BOOST_REQUIRE(db.has_schema()) Same as in previous patch, the cql_test_env::require_table_exists() helper is exactly the same, but returns future and asserts on failures for no gain Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 11:46:32 +03:00
Avi Kivity	42a1ced73b	cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt The expression system uses managed_bytes_opt for values, but result_set uses bytes_opt. This means that processing values from the result set in expressions requires a copy. Out of the two, managed_bytes_opt is the better choice, since it prevents large contiguous allocations for large blobs. So we switch result_set to use managed_bytes_opt. Users of the result_set API are adjusted. The db::function interface is not modified to limit churn; instead we convert the types on entry and exit. This will be adjusted in a following patch.	2023-05-07 17:17:36 +03:00
Kefu Chai	df63e2ba27	types: move types.{cc,hh} into types they are part of the CQL type system, and are "closer" to types. let's move them into "types" directory. the building systems are updated accordingly. the source files referencing `types.hh` were updated using following command: ``` find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} + ``` the source files under sstables include "types.hh", which is indeed the one located under "sstables", so include "sstables/types.hh" instead, so it's more explicit. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12926	2023-02-19 21:05:45 +02:00
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Raphael S. Carvalho	3c5afb2d5c	test: Enable Scylla test command line options for boost tests We have enabled the command line options without changing a single line of code, we only had to replace old include with scylla_test_case.hh. Next step is to add x-log-compaction-groups options, which will determine the number of compaction groups to be used by all instantiations of replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Kamil Braun	3be376f6c5	test/boost: cdc_test: remove test_cdc_across_shards The test checked if creating a table with CDC enabled on shard other than 0 would create the CDC log table as well; it was a regression test for #5582. However we will soon bounce all schema change requests to shard 0, so the test's purpose is gone. I need to remove this test because `cquery_nofail` does not handle the bouncing correctly: it silently accepts the bounce message, assumes that the query was successful and returns. So after we change the code to start bouncing all requests to shard 0, if a query was run inside test code using `cquery_nofail` on a shard different than 0 it would do nothing and following queries executed on shard 0 would fail because they depended on the effect of the aforementioned query.	2022-06-23 16:14:41 +02:00
Calle Wilund	adda43edc7	CDC - do not remove log table on CDC disable Fixes #10489 Killing the CDC log table on CDC disable is unhelpful in many ways, partly because it can cause random exceptions on nodes trying to do a CDC-enabled write at the same time as log table is dropped, but also because it makes it impossible to collect data generated before CDC was turned off, but which is not yet consumed. Since data should be TTL:ed anyway, retaining the table should not really add any overhead beyond the compaction to eventually clear it. And if the user did set TTL=0 (disabled), then he is already responsible for clearing out the data. This also has the nice feature of meshing with the alternator streams semantics. Closes #10601	2022-05-31 19:07:07 +03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	2d25705db0	cql3: deinline non-trivial methods in selection.hh This allows us to forward-declare raw_selector, which in turn reduces indirect inclusions of expression.hh from 147 to 58, reducing rebuilds when anything in that area changes. Includes that were lost due to the change are restored in individual translation units. Closes #9434	2021-10-05 12:58:55 +02:00
Benny Halevy	4439e5c132	everywhere: cleanup defer.hh includes Get rid of unused includes of seastar/util/{defer,closeable}.hh and add a few that are missing from source files. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:39 +03:00
Pavel Solodovnikov	76bea23174	treewide: reduce header interdependencies Use forward declarations wherever possible. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Closes #8813	2021-06-07 15:58:35 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Kamil Braun	c948573398	sys_dist_ks: don't create old CDC generations table on service initialization The old table won't be created in clusters that are bootstrapped after this commit. It will stay in clusters that were upgraded from a version before this commit. Note that a fully upgraded cluster doesn't automatically create a new generation in the new format. Even if the last generation was created before the upgrade, the cluster will keep using it. A new generation will be created in the new format when either: 1. a new node bootstraps (in the new version), 2. or the user runs checkAndRepairCdcStreams, which has a new check: if the current generation uses the old format, the command will decide that repair is needed, even if the generation is completely fine otherwise (also in the new version). During upgrade, while the CDC_GENERATIONS_V2 feature is still not enabled, the user may still bootstrap a node in the old version of Scylla or run checkAndRepairCdcStreams on a not-yet-upgraded node. In that case a new generation will be created in the old format, using the old table definitions.	2021-05-25 16:07:23 +02:00
Kamil Braun	f25e77c202	test: cdc: include new generations table in permissions test	2021-05-25 16:07:23 +02:00
Piotr Grabowski	778fbb144f	cdc: tests: check cdc$deleted_ columns in images Add a test that checks whether the cdc$deleted_ columns are properly filled in the pre/post-image rows. This test checks tables with only atomic columns, tables with frozen collections and non-frozen collections. The test is performed with both 'true' pre-image mode and 'full' pre-image mode.	2021-05-04 12:33:15 +02:00
Kamil Braun	67d4e5576d	sys_dist_ks: split CDC streams table partitions into clustered rows Until now, the lists of streams in the `cdc_streams_descriptions` table for a given generation were stored in a single collection. This solution has multiple problems when dealing with large clusters (which produce large lists of streams): 1. large allocations 2. reactor stalls 3. mutations too large to even fit in commitlog segments This commit changes the schema of the table as described in issue #7993. The streams are grouped according to token ranges, each token range being represented by a separate clustering row. Rows are inserted in reasonably large batches for efficiency. The table is renamed to enable easy upgrade. On upgrade, the latest CDC generation's list of streams will be (re-)inserted into the new table. Yet another table is added: one that contains only the generation timestamps clustered in a single partition. This makes it easy for CDC clients to learn about new generations. It also enables an elegant two-phase insertion procedure of the generation description: first we insert the streams; only after ensuring that a quorum of replicas contains them, we insert the timestamp. Thus, if any client observes a timestamp in the timestamps table (even using a ONE query), it means that a quorum of replicas must contain the list of streams.	2021-02-18 11:44:59 +01:00
Kamil Braun	2da723b9c8	cdc: produce postimage when inserting with no regular columns When a row was inserted into a table with no regular columns, and no such row existed in the first place, postimage would not be produced. Fix this. Fixes #7716. Closes #7723	2020-12-01 18:01:23 +02:00
Piotr Jastrzebski	debd10cc55	cdc: Remove trailing whitespaces from cdc_tests The change was performed automatically using vim and :%s/\s\+$//e Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-19 16:25:22 +01:00
Piotr Jastrzebski	6bdbfbafb7	cdc: Remove mk_cdc_test_config from tests Now that CDC is GA and enabled by default, there's no longer a need for a specific config in CDC tests. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-19 16:21:32 +01:00
Piotr Jastrzebski	e9072542c1	Mark CDC as GA Enable CDC by default. Rename CDC experimental feature to UNUSED_CDC to keep accepting cdc flag. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-12 12:36:13 +01:00
Calle Wilund	46ea8c9b8b	cdc: Add an "end-of-record" column to Fixes #7435 Adds an "eor" (end-of-record) column to cdc log. This is non-null only on last-in-timestamp group rows, i.e. end of a singular source "event". A client can use this as a shortcut to knowing whether or not he has a full cdc "record" for a given source mutation (single row change). Closes #7436	2020-10-26 09:39:27 +02:00
Kamil Braun	ff78a3c332	cdc: rename CDC description tables... again Commit `a6ad70d3da` changed the format of stream IDs: the lower 8 bytes were previously generated randomly, now some of them have semantics. In particular, the least significant byte contains a version (stream IDs might evolve with further releases). This is a backward-incompatible change: the code won't properly handle stream IDs with all lower 8 bytes generated randomly. To protect us from subtle bugs, the code has an assertion that checks the stream ID's version. This means that if an experimental user used CDC before the change and then upgraded, they might hit the assertion when a node attempts to retrieve a CDC generation with old stream IDs from the CDC description tables and then decode it. In effect, the user won't even be able to start a node. Similarly as with the case described in `d89b7a0548`, the simplest fix is to rename the tables. This fix must get merged in before CDC goes out of experimental. Now, if the user upgrades their cluster from a pre-rename version, the node will simply complain that it can't obtain the CDC generation instead of preventing the cluster from working. The user will be able to use CDC after running checkAndRepairCDCStreams. Since a new table is added to the system_distributed keyspace, the cluster's schema has changed, so sstables and digests need to be regenerated for schema_digest_test.	2020-08-31 11:33:14 +03:00
Calle Wilund	e50911e5b0	cdc: Do not generate pre/post image for non-existent rows Fixes #7119 Fixes #7120 If preimage select came up empty - i.e. the row did not exist, either because it was never created or because it was deleted - we should not bother creating a log preimage row for it. Esp. since it makes it harder to interpret the cdc log. If an operation in a cdc batch did a row delete (ranged, ck, etc), do not generate postimage data, since the row no longer exists. Note that we differentiate deleting all (non-pk/ck) columns from actual row delete.	2020-08-26 18:14:09 +00:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously `count` function was often used in various ways. `contains` does not only express the intent of the code better but also does it in more unified way. This commit replaces all the occurrences of the `count` with the `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Calle Wilund	8cc5076033	cdc_test: Do small test of "full" Not a huge test change, but at least verifies it works.	2020-08-12 16:04:52 +00:00
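
Note on commit 67410cac4d above: its message describes a helper that takes two sets of streams and builds their difference as closed and opened streams. The sketch below only illustrates that idea and is not the code from the commit. The names `stream_id`, `stream_diff` and `diff_streams` are hypothetical, and a 64-bit integer stands in for the real 16-byte stream id. It assumes both inputs are sorted in ascending order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a CDC stream id (the real id is 16 bytes wide).
using stream_id = std::uint64_t;

// Hypothetical result type: which streams disappeared and which appeared.
struct stream_diff {
    std::vector<stream_id> closed; // present in `before`, absent in `after`
    std::vector<stream_id> opened; // absent in `before`, present in `after`
};

// Computes the difference between two sorted stream sets.
// Assumption: both vectors are sorted ascending (checked by the asserts).
inline stream_diff diff_streams(const std::vector<stream_id>& before,
                                const std::vector<stream_id>& after) {
    assert(std::is_sorted(before.begin(), before.end()));
    assert(std::is_sorted(after.begin(), after.end()));

    stream_diff d;
    // Streams that existed before but not after were closed.
    std::set_difference(before.begin(), before.end(),
                        after.begin(), after.end(),
                        std::back_inserter(d.closed));
    // Streams that exist after but not before were opened.
    std::set_difference(after.begin(), after.end(),
                        before.begin(), before.end(),
                        std::back_inserter(d.opened));
    return d;
}
```

Storing only such diffs per timestamp is what commit 9ec4b6ccb1 calls the more storage-efficient table format. The full stream set for a timestamp can be rebuilt by applying the diffs in order.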


101 Commits