scylladb

Author	SHA1	Message	Date
Piotr Wieczorek	a32e8091a9	alternator, cdc: Don't emit an event for equal items This commit adds a function that compares split mutations with the `row_state`, that was selected as a preimage or propagated through cdc options by a caller. If the items are equal, the corresponding log row isn't generated. The result being that creating an item with BatchWriteItem, PutItem, or UpdateItem doesn't emit an INSERT/MODIFY event if exactly identical item already exists. Comparing the items may be costly, so this logic is controlled by `alternator_streams_compabitiblity` flag. This commit handles the following cases: - `PutItem/UpdateItem/BatchWriteItem.PutItem of an existing and equal item: nothing`	2025-10-30 08:38:30 +01:00
Piotr Wieczorek	8c2f60f111	alternator/streams, cdc: Differentiate item replace and item update in CDC This commit improves compatibility with DynamoDB streams by changing the emitted events when creating/updating an item. Replace/update operations of an existing item emit a MODIFY, whereas replacing/updating a missing item results in an INSERT. If the state of the item doesn't change after applying the operation, no event is emitted. This commit handles the following cases: - `PutItem/UpdateItem/BatchWriteItem.PutItem of an existing and not equal item: MODIFY` - `PutItem/UpdateItem/BatchWriteItem.PutItem of a nonexistent item: INSERT` Refs https://github.com/scylladb/scylladb/issues/6918	2025-10-30 07:40:31 +01:00
Piotr Wieczorek	e3fde8087a	cdc: Don't split a row marker away from row cells CDC log table records a mutation as a sequence of log rows that record an atomic change (i.e. a row marker, tombstones, etc.), whereas a mutation in Alternator Streams always appears as a single log row. The type of operation is determined based on the type of the last log row in CDC. As a result, updates that create a row always appeared to Alternator Streams as an update (row marker + data), rather than an insert. This commit makes them a single log row. Its operation type is insert if it contains a row marker, and an update otherwise, which gives results consistent with DynamoDB Streams.	2025-10-30 07:40:31 +01:00
Piotr Wieczorek	a3ec6c7d1d	alternator/streams: Support userIdentity field for TTL deletions UserIdentity is a map of two fields in GetRecords responses, which always has the same value. It may be missing, or contain a constant object with value `{"type": "Service", "principalId": "dynamodb.amazonaws.com"}`. Currently, the latter is set only for `REMOVE`s triggered by TTL. This commit introduces two new CDC operation types: `service_row_delete` and `service_partition_delete`, emitted in place of `row_delete` and `partition_delete`. Alternator Streams treats them as regular `REMOVE`s, but in addition adds the `userIdentity` field to the record. This change may break existing Scylla libraries for reading raw CDC tables, but we doubt that anybody has this use case. Refs https://github.com/scylladb/scylladb/pull/26149 Refs https://github.com/scylladb/scylladb/pull/26121 Fixes https://github.com/scylladb/scylladb/issues/11523 Closes scylladb/scylladb#26460	2025-10-20 17:15:59 +02:00
Piotr Wieczorek	d4581cc442	cdc: Support prefetched preimages This commit adds support for passing a preimage selected by an upper layer to CDC. The responsibility for the correctness of the preimage (i.e. the selected columns, whether it's up to date, etc.) lies with the caller. It may be improved in the future by validating the preimage, e.g. by "slicing" the received preimage to the necessary columns. The motivation behind this change was to reduce the number of read-before-writes and avoid reading the row twice for Alternator Streams in an increased compatibility mode with DynamoDB. This is to be added in a following commit. Until then, this commit should be a no-op.	2025-10-14 07:29:07 +02:00
Piotr Wieczorek	2c1e699864	cdc, storage: Add a struct to pass per-mutation options to CDC This will allow us to communicate with CDC from higher layers. We plan to use it to reduce the number of read-before-writes with preimages by passing the row selected in upper layers.	2025-10-09 12:28:10 +02:00
Piotr Wieczorek	66935bedac	cdc: Move operations enum to the top of the namespace	2025-10-09 12:28:10 +02:00
Ernest Zaslavsky	5ba5aec1f8	treewide: Move mutation related files to a `mutation` directory As requested in #22104, moved the files and fixed other includes and build system. Moved files: - combine.hh - collection_mutation.hh - collection_mutation.cc - converting_mutation_partition_applier.hh - converting_mutation_partition_applier.cc - counters.hh - counters.cc - timestamp.hh Fixes: #22104 This is a cleanup, no need to backport Closes scylladb/scylladb#25085	2025-09-24 13:23:38 +03:00
Michael Litvak	daf200facb	cdc: add is_log_schema helper In a few places we need to check whether a schema represents a CDC log table, and we do so by checking whether the table's partitioner is the CDC partitioner. Extract this logic to a new utility function to reduce code duplication and allow reuse.	2025-09-17 14:47:11 +02:00
Dawid Pawlik	af2a544395	cdc: enable CDC log when vector index is created Enable CDC log table when creating an index on vector column using 'vector_index' custom index class.	2025-08-20 12:38:52 +02:00
Benny Halevy	3feb759943	everywhere: use utils::chunked_vector for list of mutations Currently, we use std::vector<*mutation> to keep a list of mutations for processing. This can lead to large allocation, e.g. when the vector size is a function of the number of tables. Use a chunked vector instead to prevent oversized allocations. `perf-simple-query --smp 1` results obtained for fixed 400MHz frequency and PGO disabled: Before (read path): ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=read, query_single_key=no, counters=no} Disabling auto compaction Creating 10000 partitions... 89055.97 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39417 insns/op, 18003 cycles/op, 0 errors) 103372.72 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39380 insns/op, 17300 cycles/op, 0 errors) 98942.27 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39413 insns/op, 17336 cycles/op, 0 errors) 103752.93 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39407 insns/op, 17252 cycles/op, 0 errors) 102516.77 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39403 insns/op, 17288 cycles/op, 0 errors) throughput: mean= 99528.13 standard-deviation=6155.71 median= 102516.77 median-absolute-deviation=3844.59 maximum=103752.93 minimum=89055.97 instructions_per_op: mean= 39403.99 standard-deviation=14.25 median= 39406.75 median-absolute-deviation=9.30 maximum=39416.63 minimum=39380.39 cpu_cycles_per_op: mean= 17435.81 standard-deviation=318.24 median= 17300.40 median-absolute-deviation=147.59 maximum=18002.53 minimum=17251.75 ``` After (read path) ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=read, query_single_key=no, counters=no} Disabling auto compaction Creating 10000 partitions... 
59755.04 tps ( 66.2 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39466 insns/op, 22834 cycles/op, 0 errors) 71854.16 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39417 insns/op, 17883 cycles/op, 0 errors) 82149.45 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 39411 insns/op, 17409 cycles/op, 0 errors) 49640.04 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.3 tasks/op, 39474 insns/op, 19975 cycles/op, 0 errors) 54963.22 tps ( 66.1 allocs/op, 0.0 logallocs/op, 14.3 tasks/op, 39474 insns/op, 18235 cycles/op, 0 errors) throughput: mean= 63672.38 standard-deviation=13195.12 median= 59755.04 median-absolute-deviation=8709.16 maximum=82149.45 minimum=49640.04 instructions_per_op: mean= 39448.38 standard-deviation=31.60 median= 39466.17 median-absolute-deviation=25.75 maximum=39474.12 minimum=39411.42 cpu_cycles_per_op: mean= 19267.01 standard-deviation=2217.03 median= 18234.80 median-absolute-deviation=1384.25 maximum=22834.26 minimum=17408.67 ``` `perf-simple-query --smp 1 --write` results obtained for fixed 400MHz frequency and PGO disabled: Before (write path): ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=write, query_single_key=no, counters=no} Disabling auto compaction 63736.96 tps ( 59.4 allocs/op, 16.4 logallocs/op, 14.3 tasks/op, 49667 insns/op, 19924 cycles/op, 0 errors) 64109.41 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 49992 insns/op, 20084 cycles/op, 0 errors) 56950.47 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50005 insns/op, 20501 cycles/op, 0 errors) 44858.42 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50014 insns/op, 21947 cycles/op, 0 errors) 28592.87 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50027 insns/op, 27659 cycles/op, 0 errors) throughput: mean= 51649.63 standard-deviation=15059.74 median= 56950.47 median-absolute-deviation=12087.33 maximum=64109.41 minimum=28592.87 instructions_per_op: mean= 49941.18 standard-deviation=153.76 median= 
50005.24 median-absolute-deviation=73.01 maximum=50027.07 minimum=49667.05 cpu_cycles_per_op: mean= 22023.01 standard-deviation=3249.92 median= 20500.74 median-absolute-deviation=1938.76 maximum=27658.75 minimum=19924.32 ``` After (write path) ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=write, query_single_key=no, counters=no} Disabling auto compaction 53395.93 tps ( 59.4 allocs/op, 16.5 logallocs/op, 14.3 tasks/op, 50326 insns/op, 21252 cycles/op, 0 errors) 46527.83 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50704 insns/op, 21555 cycles/op, 0 errors) 55846.30 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50731 insns/op, 21060 cycles/op, 0 errors) 55669.30 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50735 insns/op, 21521 cycles/op, 0 errors) 52130.17 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 50757 insns/op, 21334 cycles/op, 0 errors) throughput: mean= 52713.91 standard-deviation=3795.38 median= 53395.93 median-absolute-deviation=2955.40 maximum=55846.30 minimum=46527.83 instructions_per_op: mean= 50650.57 standard-deviation=182.46 median= 50731.38 median-absolute-deviation=84.09 maximum=50756.62 minimum=50325.87 cpu_cycles_per_op: mean= 21344.42 standard-deviation=202.86 median= 21334.00 median-absolute-deviation=176.37 maximum=21554.61 minimum=21060.24 ``` Fixes #24815 Improvement for rare corner cases. No backport required Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#24919	2025-07-13 19:13:11 +03:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Nadav Har'El	e639434a89	change remaining sstring_view to std::string_view Our "sstring_view" is an historic alias for the standard std::string_view. The patch changes the last remaining random uses of this old alias across our source directory to the standard type name. After this patch, there are no more uses of the "sstring_view" alias. It will be removed in the following patch. Refs #4062. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-11-18 16:48:57 +02:00
Kefu Chai	6c06751640	cdc: not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16725	2024-01-11 09:13:37 +02:00
Petr Gusev	7b55ccbd8e	token_metadata: drop the template Replace token_metadata2 ->token_metadata, make token_metadata back non-template. No behavior changes, just compilation fixes.	2023-12-12 23:19:54 +04:00
Petr Gusev	63f64f3303	token_metadata: make it a template with NodeId=inet_address/host_id NodeId is used in all internal token_metadata data structures, that previously used inet_address. We choose topology::key_kind based on the value of the template parameter. generic_token_metadata::update_topology overload with host_id parameter is added to make update_topology_change_info work, it now uses NodeId as a parameter type. topology::remove_endpoint(host_id) is added to make generic_token_metadata::remove_endpoint(NodeId) work. pending_endpoints_for and endpoints_for_reading are just removed - they are not used and not implemented. The declarations were left by mistake from a refactoring in which these methods were moved to erm. generic_token_metadata_base is extracted to contain declarations, common to both token_metadata versions. Templates are explicitly instantiated inside token_metadata.cc, since implementation part is also a template and it's not exposed to the header. There are no other behavioral changes in this commit, just syntax fixes to make token_metadata a template.	2023-12-11 12:51:34 +04:00
Botond Dénes	f8a8fe41d6	cdc/log.hh: expose is_log_name() Allow outside code to use it to determine whether a table is cdc or not. This is currently the most reliable method if the custom partitioner is not set on the schema of the investigated table.	2022-06-10 10:57:12 +03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Pavel Emelyanov	0fd00d7016	cdc: Add database argument to is_log_for_some_table All callers have been patched already. This argument can now be used to replace get_local_storage_proxy().get_db().local() call. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-08-27 14:07:26 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Emelyanov	cc813ef0dd	cdc: Remove db_context::builder Right now the builder is just an opaque transfer between cdc_service constructor args and cdc_service's db_context constructor args. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-04-29 22:46:57 +03:00
Pavel Emelyanov	3a7ca647af	cdc: Provide migration notifier right at once The only way db_context's migration notifier reference is set up is via cdc_service->db_context::builder->.build chain of calls. Since the builder's notifier optional reference is always disengaged (the .with_migration_notifier is removed by previous patch) the only possible notifier reference there is from the storage service which, in turn, is the same as in main.cc. Said that -- push the notifier reference onto db_context directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-04-29 22:40:24 +03:00
Pavel Emelyanov	421a514c30	cdc: Remove db_context::builder::with_migration_notifier It's unused and removing it makes next patch's life simpler Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-04-29 22:39:12 +03:00
Kamil Braun	e2f03e4aba	cdc: move (most of) CDC generation management code to the new service Currently all management of CDC generations happens in storage_service, which is a big ball of mud that does many unrelated things. Previous commits have introduced a new service for managing CDC generations. This code moves most of the relevant code to this new service. However, some part still remains in storage_service: the bootstrap procedure, which happens inside storage_service, must also do some initialization regarding CDC generations, for example: on restart it must retrieve the latest known generation timestamp from disk; on bootstrap it must create a new generation and announce it to other nodes. The order of these operations w.r.t the rest of the startup procedure is important, hence the startup procedure is the only right place for them. Still, what remains in storage_service is a small part of the entire CDC generation management logic; most of it has been moved to the new service. This includes listening for generation changes and updating the data structures for performing CDC log writes (cdc::metadata). Furthermore these functions now return futures (and are internally coroutines), where previously they required a seastar::async context.	2021-02-26 12:06:12 +01:00
Benny Halevy	c60da2e90d	cdc: remove _token_metadata from db_context 1. It's unused since `cbe510d1b8` 2. It's unsafe to keep a reference to token_metadata& potentially across yield points. The higher-level motivation is to make storage_service::get_token_metadata() private so we can control better how it's used. For cdc, if the token_metadata is going to be needed in the future, it'd be better to get it from db_context::_proxy.get_token_metadata_ptr(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20201213162351.52224-2-bhalevy@scylladb.com>	2020-12-13 18:32:17 +02:00
Benny Halevy	2f7c529c1c	storage_service: separate get_mutable_token_metadata Use a different getter for a token_metadata& that may be changed so we can better synchronize readers and writers of token_metadata and eventually allow them to yield in asynchronous loops. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-20 16:20:34 +03:00
Nadav Har'El	7e01ae089e	cdc: avoid including cdc/cdc_options.hh everywhere Before this patch, modifying cdc/cdc_options.hh required recompiling 264 source files. This is because this header file was included by a couple other header files - most notably schema.hh, where a forward declaration would have been enough. Only the handful of source files which really need to access the CDC options should include "cdc/cdc_options.hh" directly. After this patch, modifying cdc/cdc_options.hh requires only 6 source files to be recompiled. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200813070631.180192-1-nyh@scylladb.com>	2020-08-16 14:41:47 +03:00
Pavel Emelyanov	757a7145b9	headers: Remove mutation.hh from trace_state.hh Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-17 17:40:23 +03:00
Calle Wilund	331aa7c501	cdc: Add "is_cdc_metacolumn_name" predicate To sift column names	2020-07-15 08:10:23 +00:00
Calle Wilund	8a728ce618	cdc: Add get_base_table helper	2020-07-15 08:10:23 +00:00
Calle Wilund	8f462e8606	CDC::log: Add `base_name` helper To extract base table name from CDC log table name.	2020-07-15 08:10:23 +00:00
Kamil Braun	013330199d	cdc/storage_proxy: keep cdc_service alive in storage_proxy operations storage_proxy is never deinitialized, so it may have still used cdc_service after its destructor was called. This fixes the problem by cdc_service inheriting from async_sharded_service and storage_proxy calling shared_from_this on the service whenever it uses it. cdc_service inherits from async_sharded_service and not simply from enable_shared_from_this, because there might be other services that cdc_service depends on. Assuming that these services are deinitialized after cdc_service (as they should), i.e. after stop() is called on cdc_service, making cdc_service async_sharded_service will keep their deinitialization code from being called until all references to cdc_service disappear (async_sharded_service keeps stop() from returning until this happens). Some more improvements should be possible through some refactoring: 1. Make augment_mutation_call a free function, not a member of cdc_service: it doesn't need any state that cdc_service has. db_context can be passed down from storage_proxy when it calls the function. 2. Remove the storage_proxy -> cdc_service reference. storage_proxy only needs augment_mutation_call, which would not be a part of the service. This would also get rid of the proxy -> cdc -> proxy reference cycle that we have now, and would allow storage_proxy to be safely deinitialized after cdc_service. 3. Maybe we could even remove the cdc_service -> storage_proxy reference. Is it really needed?	2020-06-08 13:25:51 +03:00
Juliusz Stasiewicz	c70311f73e	cdc: CL for preimage select is calculated from base write CL CL of LOCAL_QUORUM used to be hardcoded into CDC preimage query and led to an error when number of replicas was lower than CL would require. The solution here is to link the CLs of writes to base table with the CLs of CDC reads, so the client will get the (limited) control over the consistency of preimage SELECTs (instead of getting error every time). The algorithm is as follows: 1. If write that caused CDC activity was done with CL = ANY, then do preimage read with CL = ONE. 2. If write that caused CDC activity was done with CL = ALL, then do preimage read with CL = QUORUM. 3. SERIAL and LOCAL_SERIAL writes cause preimage read with QUORUM and LOCAL_QUORUM, respectively. 4. In other cases do preimage read with the same CL as base write.	2020-04-21 14:33:36 +02:00
Piotr Dulikowski	5a5cc57878	cdc: create an operation_result_tracker object An `operation_result_tracker` object is now returned as a second return value from the `augment_mutation_call` function.	2020-03-23 14:05:25 +01:00
Piotr Dulikowski	59727fb34b	cdc: remove result_callback The `result_callback` was a callback returned by `augment_mutation_call` that was supposed to be used in the CDC postimage implementation. Because CDC postimage was implemented without using this callback, and currently a no-op function is always returned, this callback can safely be removed.	2020-03-19 14:55:07 +02:00
Nadav Har'El	35d95d6887	merge: Add postimage implementation Merged pull request https://github.com/scylladb/scylla/pull/5996 from Calle Wilund: Fixes #4992 Implements post-image support by synthesizing it from pre-image + delta. Post-image data differs from the delta data in two ways: 1.) It merges non-atomics into an actual result value 2.) It contains all columns of the row, not just those affected by the update. For a non-atomic field, the post-image value of a column is either the pre-image or the delta (maybe null) Tested by adding post-image checks to pre-image test and collection/udt tests	2020-03-16 13:42:07 +02:00
Calle Wilund	0a3383c090	cdc: Add postimage implementation Fixes #4992 Implements post-image support by synthesizing it from pre-image + delta. Post-image data differs from the delta data in two ways: 1.) It merges non-atomics into an actual result value 2.) It contains _all_ columns of the row, not just those affected by the update. For a non-atomic field, the post-image value of a column is either the pre-image or the delta (maybe null) Tested by adding post-image checks to pre-image test and collection/udt tests	2020-03-16 09:21:06 +00:00
Piotr Dulikowski	b1e8170bf9	cdc: add tracing Adds information about the stages of CDC mutation augmentation to tracing sessions.	2020-03-15 11:54:10 +01:00
Kamil Braun	3200d415da	cdc: use a single timeuuid value for a batch of changes If a batch update is performed with a sequence of changes with a single timestamp, they will now show up in CDC with a single timeuuid in the `time` column, distinguished by different `batch_seq_no` values. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-05 12:32:57 +01:00
Calle Wilund	ed0d1c5fe2	cdc: Break up data column tuple According to "new" spec: Data column is now pure frozen original type. If column is deleted (set to null), a metadata column cdc$deleted_<name> is set to true, to distinguish null column == not involved in row operation For non-atomic collections, a cdc$deleted_elements_<name> column is added, and when removing elements from collection this is where they are shown. For non-atomic assign, the "cdc$deleted_<name>" is true, and <name> is set to new value. column_op removed.	2020-03-03 08:52:20 +00:00
Calle Wilund	1085860c62	cdc: Rename metadata and data columns according to new spec Also use transformation methods for names in all code + tests to make switching again easier	2020-03-02 09:34:51 +00:00
Juliusz Stasiewicz	cf24ae86f3	cdc: distinguishing update from insert When incoming mutation contains live row marker the `operation` is described as "insert", not as an "update". Also, I extended the test case "test_row_delete" with one insert, which is expected to log different value of `operation` than update or delete. Renamed the test case accordingly. Test cases that relied on "update" being the same as "insert" are updated accordingly (`test_pre_image_logging`, `test_cdc_across_shards`, `test_add_columns`). Fixes #5723	2020-03-01 17:50:08 +02:00
Piotr Dulikowski	82a2bdf39f	cdc: distinguish open and closed ranges for range delete This patch causes inclusive and exclusive range deletes to be distinguished in cdc log. Previously, operations `range_delete_start` and `range_delete_end` were used for both inclusive and exclusive bounds in range deletes. Now, old operations were renamed to `range_delete_*_inclusive`, and for exclusive deletes, new operations `range_delete_*_exclusive` are used. Tests: unit(dev)	2020-02-20 11:39:06 +01:00
Piotr Dulikowski	6fe4f9ded8	cdc: restrict permissions on _scylla_cdc_log tables Disallows DROP permission on CDC log tables.	2020-02-10 15:40:48 +01:00
Piotr Jastrzebski	97262bec82	cdc: remove partitioner from db_context partitioner from cdc::db_context is no longer used so it can be removed. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-06 08:00:01 +01:00
Kamil Braun	bd42b10df1	cdc: rename cdc/cdc.{hh,cc} to cdc/log.{hh,cc} To increase modularity, making it easier to find what is where and maintain. The 'log' module (cdc/log.{hh,cc}) is responsible for updating CDC log tables when base table writes are performed. The 'generation' module (cdc/generation.{hh,cc}) handles stream generation changes in response to topology change events. cdc/metadata.{hh,cc} contains a helper class which holds the currently used generation of streams. It is used by both aforementioned modules: 'log' queries it, while 'generation' updates it.	2020-01-30 11:10:39 +01:00
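
Commit c70311f73e above lists four rules that derive the preimage read consistency level from the consistency level of the base write. A minimal sketch of that mapping follows. The enum `consistency_level` and the function `preimage_read_cl` are hypothetical names introduced only for illustration, not ScyllaDB's own declarations, and the set of enumerators is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for the consistency levels named in the commit
// message; this is not ScyllaDB's own enum.
enum class consistency_level {
    ANY, ONE, TWO, THREE, QUORUM, ALL,
    LOCAL_QUORUM, EACH_QUORUM, SERIAL, LOCAL_SERIAL, LOCAL_ONE
};

// Maps the CL of the base-table write to the CL used for the preimage read,
// following the four rules listed in the commit message.
inline consistency_level preimage_read_cl(consistency_level write_cl) {
    switch (write_cl) {
    case consistency_level::ANY:          // rule 1
        return consistency_level::ONE;
    case consistency_level::ALL:          // rule 2
        return consistency_level::QUORUM;
    case consistency_level::SERIAL:       // rule 3
        return consistency_level::QUORUM;
    case consistency_level::LOCAL_SERIAL: // rule 3
        return consistency_level::LOCAL_QUORUM;
    default:                              // rule 4: reuse the write CL
        return write_cl;
    }
}

// Usage example: a few spot checks of the mapping.
inline void check_mapping() {
    assert(preimage_read_cl(consistency_level::ANY) == consistency_level::ONE);
    assert(preimage_read_cl(consistency_level::ALL) == consistency_level::QUORUM);
    assert(preimage_read_cl(consistency_level::LOCAL_QUORUM) == consistency_level::LOCAL_QUORUM);
}
```

The read level never exceeds the write level, which matches the stated goal of avoiding errors when fewer replicas are available than a hardcoded LOCAL_QUORUM would require.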

47 Commits
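
Commits a32e8091a9 and 8c2f60f111 at the top of this history describe three outcomes for a write that creates or replaces an item in Alternator Streams: INSERT for a nonexistent item, MODIFY for an existing item whose state changes, and no event when the item is unchanged. The sketch below illustrates that decision only. The types `item` and `stream_event` and the function `classify_write` are hypothetical, and an item is modelled as a plain string map, which is an assumption and not how ScyllaDB compares split mutations with the `row_state`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of an item: attribute name -> attribute value.
using item = std::map<std::string, std::string>;

// Hypothetical result type; NONE means no stream record is generated.
enum class stream_event { NONE, INSERT, MODIFY };

// preimage: state of the item before the write, or nullptr if it did not exist.
// after:    state of the item once the write has been applied.
inline stream_event classify_write(const item* preimage, const item& after) {
    if (preimage == nullptr) {
        return stream_event::INSERT;   // nonexistent item
    }
    if (*preimage == after) {
        return stream_event::NONE;     // existing and equal item: nothing
    }
    return stream_event::MODIFY;       // existing and not equal item
}

// Usage example covering the three cases from the commit messages.
inline void check_classification() {
    item before{{"name", "a"}};
    item same{{"name", "a"}};
    item changed{{"name", "b"}};
    assert(classify_write(nullptr, same) == stream_event::INSERT);
    assert(classify_write(&before, same) == stream_event::NONE);
    assert(classify_write(&before, changed) == stream_event::MODIFY);
}
```

As the commit message notes, the equality comparison may be costly, which is why the real logic is gated behind a configuration flag instead of always running.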